13.  ABSTRACT (Maximum 200 words)

The  subject  of  this  report  is  the  detection  of  weak  targets  in  a  strong  clutter 
environment.  Two  situations  arise  depending  on  whether  or  not  the  weak  targets 
can  be  separated  from  the  clutter.  For  both  cases  new  receivers  are  derived 
which  provide  significant  improvement  in  performance  over  other  recently  proposed 
techniques.  This  work  includes  development  of  an  adaptive  joint-domain  space-time 
processor, effective non-Gaussian weak signal detectors based on spherically
invariant random processes, and a new method for approximating the underlying
probability density function of random data which works extremely well with
only 100 samples.


14. SUBJECT TERMS  Locally Optimum Detector, Spherically Invariant
Random Processes, Probability Density Function, Weak Signal
Detector, Radar, Space-Time Processing, Clutter, Non-Gaussian
Executive Summary

The  subject  of  this  report  is  the  detection  of  weak  targets  in  a  strong  clutter  environment. 
Two  situations  arise  depending  upon  whether  or  not  the  weak  targets  can  be  separated  from 
the  clutter.  For  both  cases  new  receivers  are  derived  which  provide  significant  improvement  in 
performance  over  other  recently  proposed  techniques.  This  work  includes  development  of  an 
adaptive  joint-domain  space-time  processor,  effective  non-Gaussian  weak  signal  detectors  based 
on  spherically  invariant  random  processes,  and  a  new  method  for  approximating  the  underlying 
probability  density  function  of  random  data  which  works  extremely  well  with  only  100  samples. 

When  the  target  and  clutter  are  separable,  space-time  processing  is  effective  in  detecting  the 
target.  In  effect,  this  approach  maximizes  the  signal-to-clutter  ratio  by  using  two-dimensional 
filters  on  the  joint  spatial  and  Doppler  spectra  to  isolate  the  target  from  the  clutter.  Furthermore, 
for  Gaussian  clutter,  space-time  processing  is  the  optimum  approach  for  detecting  weak  targets 
in  a  strong  clutter  background  whether  or  not  the  targets  and  clutter  are  separable. 

Unfortunately,  when  the  target  and  clutter  spectra  completely  overlap,  space-time  processing 
is  ineffective  in  detecting  weak  targets.  Nothing  can  be  done  to  improve  performance  for  the 
Gaussian  clutter  case.  However,  for  non-Gaussian  clutter,  effective  weak  signal  detectors  do 
exist.  Nevertheless,  this  is  an  area  which,  in  spite  of  its  importance,  has  received  relatively  little 
attention.  Much  of  this  report  is  devoted  to 

1.  the  characterization,  generation,  and  approximation  of  correlated  non- 
Gaussian  radar  clutter  samples  and 

2.  the  design  and  performance  of  the  corresponding  weak  signal  detectors. 

Many  new  and  significant  results  are  discussed  in  this  report  and  are  summarized  below: 

(1)  An adaptive joint-domain space-time processor is derived which
not only outperforms currently proposed space-time processors
but also converges more rapidly and processes data more efficiently.

(2)  Spherically invariant random processes (SIRPs) are shown to be
an attractive approach to the extremely difficult problem of modeling
correlated non-Gaussian random variables. Many useful and
desirable properties of SIRPs are derived in a straightforward
tutorial manner.

(3)  To make it possible to model many different types of correlated
non-Gaussian clutter (e.g., Weibull, K-distributed, Rician),
an extensive library of SIRPs is developed.

(4)  To enable computer simulation of correlated non-Gaussian radar
clutter samples, which are needed for evaluating receiver performance,
two different canonical generation schemes are derived.

(5)  Since the probability distribution underlying clutter is not likely
to be known in advance, a new method for approximating the univariate
probability density function of random data is developed
which outperforms existing techniques while using significantly
fewer data samples.

(6)  To approximate the probability distribution underlying the N correlated
non-Gaussian radar returns received during a coherent
processing interval, the technique developed in item 5 is extended
in a simple manner to the multivariate probability density function
arising from spherically invariant random processes.

(7)  Weak signal receivers, known as locally optimum detectors, are
derived for correlated non-Gaussian clutter that can be approximated
by SIRPs. These detectors are shown to be canonical in
form and combine the conventional Gaussian receiver with the
appropriate nonlinearity.

(8)  Because the locally optimum detectors are nonlinear and involve
non-Gaussian inputs, their performance must be evaluated by
Monte Carlo simulation. A technique is developed for determining
the receiver thresholds that reduces by several orders of magnitude
the number of Monte Carlo trials required.

(9)  The  locally  optimum  detector  for  multivariate  Student-T  clutter 
is  shown  to  significantly  outperform  the  conventional  Gaussian 
receiver  when  the  target  and  clutter  spectra  completely  overlap. 
Chapter 1
Introduction 


The  subject  of  this  report  is  the  detection  of  weak  targets  in  a  strong  clutter  environment. 
Two  situations  arise  depending  upon  whether  or  not  the  weak  targets  can  be  separated  from 
the  clutter.  For  both  cases  new  receivers  are  derived  which  provide  significant  improvement  in 
performance  over  other  recently  proposed  techniques.  This  work  includes  development  of  an 
adaptive  joint-domain  space-time  processor,  effective  non-Gaussian  weak  signal  detectors  based 
on  spherically  invariant  random  processes,  and  a  new  method  for  approximating  the  underlying 
probability  density  function  of  random  data  which  works  extremely  well  with  only  100  samples. 
Many  new  algorithms  were  developed  for  this  purpose  and  resulted  in  extensive  new  software. 
In  a  companion  volume  the  quality  of  some  of  this  software  is  evaluated  and  discussed. 

Two  situations  arise  depending  upon  whether  or  not  the  weak  targets  can  be  separated  from 
the  clutter.  For  example,  consider  the  situation  illustrated  in  Figure  1.1  where  the  joint  spatial 
and Doppler spectra of the received radar samples are shown for targets T1 and T2 and a single
clutter patch. Obviously, target T1 can be separated from the clutter by means of filtering,
whereas target T2 cannot.

When  the  target  can  be  separated  from  the  clutter,  performance  is  limited  by  the  background 
noise.  Assuming  a  large  signal-to-noise  ratio,  we  refer  to  this  as  the  strong  signal  case.  When 
the  target  and  clutter  overlap  and  the  clutter-to-noise  ratio  is  large,  performance  is  limited  by 
the  clutter.  Assuming  a  small  signal-to-clutter  ratio,  we  refer  to  this  case  as  the  weak  signal 
case.  Finally,  when  the  clutter  spectrum  partially  overlaps  the  target  spectrum,  performance  is 
limited  by  both  the  clutter  and  noise.  We  refer  to  this  situation  as  the  intermediate  signal  case. 
The strong and intermediate signal cases are suitable for the adaptive joint-domain space-time
processor discussed in Chapter 2. The remainder of this report, Chapters 3-11, is devoted to
the solution of the weak signal case.

Figure 1.1: Illustration of Target and Clutter Spectra

1.1  Adaptive  Implementation  of  Optimum  Space-Time  Processing 

A  new  adaptive  algorithm,  called  the  Joint-Domain  Localized  Generalized  Likelihood  Ratio 
(JDL-GLR)  detection  algorithm,  is  presented  in  Chapter  2.  This  algorithm  takes  advantage  of 
the  fact  that  it  may  be  possible  to  separate  the  weak  target  from  the  strong  clutter  (interference) 
by  means  of  space-time  processing.  Specifically,  space-time  processing  transforms  the  received 
samples  in  space  and  time  to  a  two-dimensional  power  spectral  density  involving  both  spatial 
and  Doppler  frequencies.  The  spatial  frequency  is  a  function  of  the  angle  of  arrival  of  the 
radar  pulse  return  (interference)  plane  waves  with  respect  to  the  broadside  of  the  antenna  array 
while  the  Doppler  frequency  is  linearly  proportional  to  the  radial  velocity  of  the  object  from 
which  the  radar  pulse  is  reflected  (platforms  from  which  the  interference  is  emitted).  When  the 
radar  target’s  angle  of  arrival  and/or  radial  velocity  differs  significantly  from  those  of  the  clutter 
(interference),  it  is  possible  to  separate  out  the  target  return.  System  performance  is  then  limited 
primarily  by  the  background  noise.  Because  the  clutter  (interference)  environment  is  unknown 
a  priori  and  is  likely  to  change  with  time  and  spatial  position,  the  algorithm  must  be  adaptive 
with a sufficiently fast convergence rate. The JDL-GLR algorithm presented in Chapter 2 is
both data-efficient and computationally efficient and converges quickly for Gaussian random processes.
Embedded CFAR and robustness in non-Gaussian clutter (interference) are other properties of
this algorithm.

1.2  Weak  Signal  Detection 

The  algorithms  presented  in  Chapters  3-11  were  developed  to  handle  the  case  for  which  it  is 
not  possible  to  separate  the  target  return  from  the  clutter  (interference).  In  other  words,  these 
algorithms  are  intended  to  be  applied  only  when  the  target  and  clutter  (interference)  spectra 
overlap  significantly.  We  refer  to  this  situation  as  the  weak  signal  problem.  For  this  problem, 
system  performance  is  limited  primarily  by  the  clutter  (interference).  Several  new  algorithms 
have  been  developed  for  the  weak  signal  detection  problem.  Although  these  algorithms  can  be 
used  to  combat  both  clutter  and  interference,  for  ease  of  discussion,  the  presentation  focuses 
only  on  weak  signal  detection  in  a  strong  clutter  background.  The  statistics  of  clutter  have  been 
observed  to  be  both  Gaussian  and  non-Gaussian.  Because  the  weak  signal  detector  for  Gaussian 
processes  is  identical  to  that  for  strong  signals,  only  the  non-Gaussian  case  is  considered  in 
Chapters  3-11. 

1.3  Literature  Review  on  Spherically  Invariant  Random  Processes 

In general, the radar receiver receives N complex (or 2N quadrature component) samples from
each  radar  resolution  cell.  To  develop  an  optimal  receiver,  it  is  necessary  to  have  a  closed  form 
analytical  expression  for  the  joint  probability  density  function  (PDF)  of  the  received  samples. 
When  the  N  samples  are  statistically  independent,  the  joint  PDF  is  simply  the  product  of  the 
marginal  PDFs.  However,  clutter  samples  are  likely  to  be  correlated.  Because  this  correlation  is 
useful  for  canceling  the  clutter,  it  is  important  that  the  correlation  be  modeled.  Unfortunately, 
when  the  received  samples  are  correlated  and  non-Gaussian,  there  are  no  unique  analytical 
expressions for their joint PDF. A search of the mathematical and signal processing literature
reveals that the theory of spherically invariant random processes (SIRPs) provides a powerful
mechanism for obtaining the joint PDF of N correlated non-Gaussian random variables. The
literature on SIRPs is reviewed in Chapter 3.

1.4  Radar  Clutter  Modeling  Using  SIRPs 

As mentioned previously, the clutter is unknown a priori and is likely to change with time and
spatial position. Consequently, it is necessary to continuously monitor the environment in order
to determine the statistical nature of the clutter. To be able to model as many different types
of clutter as possible, a large library of multivariate non-Gaussian PDFs is necessary. Based
on the properties of SIRPs, a library of joint PDFs is developed in Chapter 4 for correlated
non-Gaussian random variables.

1.5  Computer Generation of Simulated Radar Clutter Characterized as SIRPs

When dealing with non-Gaussian random processes, it is usually difficult, if not impossible,
to analytically evaluate system performance. Performance must then be determined by means
of  computer  simulation.  Two  canonical  procedures  are  presented  in  Chapter  5  for  generating 
correlated  non-Gaussian  random  variables  which  can  be  used  to  simulate  samples  from  SIRPs. 
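
As a rough illustration of what such a generator can look like, the sketch below draws correlated non-Gaussian vectors by dividing a correlated Gaussian vector by an independent positive random scale, a standard construction that yields the multivariate Student-T member of the SIRP family. It is not claimed to be either of the two procedures of Chapter 5. The function name `generate_student_t_sirv`, the shape parameter `nu`, and the example covariance matrix are assumptions made for this example.

```python
import numpy as np

def generate_student_t_sirv(cov, nu, num_vectors, rng):
    """Draw real multivariate Student-T vectors as a Gaussian scale mixture.

    Assumed construction (not taken from the report): x = z / sqrt(g / nu),
    where z ~ N(0, cov) and g is chi-square with nu degrees of freedom.
    """
    n = cov.shape[0]
    chol = np.linalg.cholesky(cov)                        # cov = chol @ chol.T
    z = rng.standard_normal((num_vectors, n)) @ chol.T    # correlated Gaussian rows
    g = rng.chisquare(nu, size=num_vectors)               # one random scale per vector
    return z / np.sqrt(g / nu)[:, None]

rng = np.random.default_rng(0)
n, rho, nu = 4, 0.9, 10.0                                 # hypothetical example values
cov = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
x = generate_student_t_sirv(cov, nu, num_vectors=200_000, rng=rng)
sample_corr = np.corrcoef(x, rowvar=False)
```

For `nu` greater than 2 the generated vectors keep the correlation structure of `cov`, while each marginal variance is inflated by the factor `nu / (nu - 2)` relative to the Gaussian case.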

1.6  A  New  Method  for  Univariate  Distribution  Approximation 

Because  the  clutter  environment  is  unknown  a  priori,  the  PDF  underlying  a  set  of  N  samples 
must  be  approximated  using  measured  samples  from  the  environment.  Chapter  6  describes  an 
algorithm  for  analyzing  univariate  random  data.  This  algorithm  has  two  modes  of  operation. In 
the  first  mode,  the  algorithm  performs  a  goodness-of-fit  test.  Specifically,  the  test  determines, 
to  a  desired  confidence  level,  whether  random  data  is  statistically  consistent  with  a  specified 
probability  distribution.  In  the  second  mode  of  operation,  the  algorithm  approximates  the  PDF 
underlying  the  random  data.  In  particular,  by  analyzing  the  random  data  and  without  any  a 
priori  knowledge,  the  algorithm  identifies  from  a  stored  library  of  PDFs  that  density  function 
which  best  approximates  the  data.  Estimates  of  the  scale,  location,  and  shape  parameters  of  the 
PDF  are  provided  by  the  algorithm.  Of  particular  note  is  the  observation  that  the  algorithm 
typically  works  well  with  small  sample  sizes  of  between  50  and  100  samples. 

1.7  Distribution  Approximation  of  Radar  Clutter  by  SIRPs 

As noted earlier, the N complex samples received from each radar resolution cell are characterized
by a multivariate PDF. For SIRPs, it is shown in Chapter 7 that the multivariate distribution
approximation  problem  can  be  reduced  to  an  equivalent  univariate  distribution  approximation 
problem.  Consequently,  the  algorithm  of  Chapter  6  is  also  used  in  Chapter  7  to  approximate  the 
joint  PDF  underlying  N  correlated  non-Gaussian  clutter  samples  provided  they  are  generated 
from  an  SIRP. 
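
One way to picture this reduction: for a spherically invariant random vector, the joint PDF depends on the data only through a quadratic form in the samples, so each N-dimensional vector can be collapsed to a single scalar before a univariate fit is attempted. The sketch below computes that scalar for real-valued data with a known covariance matrix. The function name `quadratic_forms` and the Gaussian test data are assumptions made for illustration, and the procedure of Chapter 7 may differ in detail.

```python
import numpy as np

def quadratic_forms(x, cov):
    """Return q_k = x_k^T cov^{-1} x_k for each row x_k of x (real data assumed)."""
    solved = np.linalg.solve(cov, x.T)           # column k holds cov^{-1} x_k
    return np.einsum("kn,nk->k", x, solved)

rng = np.random.default_rng(2)
n = 4                                            # hypothetical vector length
cov = 0.8 ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
x = rng.multivariate_normal(np.zeros(n), cov, size=50_000)
q = quadratic_forms(x, cov)
```

For Gaussian data the quadratic form is chi-square distributed with `n` degrees of freedom, so its sample mean should be close to `n`; a non-Gaussian SIRP would show a different univariate law for `q`.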
1.8  Weak Signal Detection

The  weak  signal  detection  problem  is  developed  in  Chapter  8.  Problems  encountered  in  the 
optimum  likelihood  ratio  test  (LRT)  are  pointed  out.  The  concept  of  the  locally  optimum  detector 
(LOD)  is  introduced  as  a  practical  detector  structure  for  the  weak  signal  problem. 

1.9  The  Locally  Optimum  Detector 

The  LOD  is  derived  in  Chapter  9  using  two  different  approaches.  Both  deterministic  and 
random  target  signals  are  considered.  It  is  shown  that  the  LOD  determines  whether  a  target  is 
present  or  not  by  comparing  a  statistic  computed  from  the  data  to  a  set  threshold.  The  receiver 
structures are specialized to the case for which the clutter plus noise can be approximated as an
SIRP. 

1.10  Determining  Thresholds  for  the  Locally  Optimum  Detector 

Not only is the clutter assumed to be non-Gaussian, but the LOD receiver structure is also nonlinear.
As a result, system performance must be determined by means of computer simulation. The
threshold is conventionally determined through a Monte Carlo procedure. Unfortunately, the
number of trials required is inversely proportional to the false alarm probability $P_F$. For example, when
$P_F = 10^{-6}$, a minimum of ten million trials need to be generated. To avoid carrying out so
many trials, a new technique, based on extreme value theory, is presented in Chapter 10. It is
demonstrated that fairly accurate thresholds can be determined for false alarm probabilities as
small as $10^{-7}$ with as few as 5000-10,000 trials.
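
To make the cost of the conventional procedure concrete, the following sketch estimates a threshold as an empirical quantile of simulated signal-absent test statistics. The exponential stand-in statistic, the function name `monte_carlo_threshold`, and the false alarm probability of 0.01 are assumptions chosen so that the example runs quickly and can be checked against a known answer. The extreme-value technique of Chapter 10 is not reproduced here.

```python
import numpy as np

def monte_carlo_threshold(statistics, p_fa):
    """Empirical threshold: the (1 - p_fa) quantile of signal-absent statistics."""
    ordered = np.sort(statistics)
    # Choose the order statistic exceeded by at most a fraction p_fa of the trials.
    k = int(np.ceil((1.0 - p_fa) * ordered.size)) - 1
    return ordered[k]

rng = np.random.default_rng(1)
p_fa = 1e-2
# Stand-in statistic (an assumption): unit-mean exponential, for which the
# exact threshold is -ln(p_fa).
trials = rng.exponential(1.0, size=200_000)
threshold = monte_carlo_threshold(trials, p_fa)
exact = -np.log(p_fa)
```

Only about `p_fa * trials.size` samples fall above the threshold, which is why the number of trials must grow in inverse proportion to the false alarm probability to keep the estimate stable.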

1.11  Performance of the LOD for the Multivariate Student-T Distribution

Assuming  that  the  clutter  plus  noise  can  be  approximated  by  the  multivariate  Student-T 
distribution,  the  LOD  is  developed  in  Chapter  11  for  the  weak  signal  detection  problem.  The 
system performance is evaluated by means of computer simulation. When $P_F$ is less than or
equal to $10^{-2}$, it is shown that the Gaussian receiver requires a signal-to-clutter ratio
10-20 dB larger than that required by the LOD for the same values of $P_D$ and $P_F$.
Chapter 2

Adaptive  Implementation  of  Optimum 
Space-Time  Processing 


2.1  Introduction 

It  is  highly  desirable  for  an  airborne  surveillance  radar  system  to  have  the  optimum  or  near 
optimum  performance  for  detection  of  weak  targets  in  strong  clutter/interference  of  complicated 
angle-Doppler  spectrum.  As  the  clutter/interference  spectrum  is  unknown  to  the  system  and  the 
clutter/interference environment may be varying in both time and space, i.e., nonstationary and
nonhomogeneous, the signal processor must be adaptive with a sufficiently fast convergence rate.

Consider a system which employs Na spatial channels (subarrays of a phased array) and has
Nt pulses in its Coherent Processing Interval (CPI). The optimum processor, or the Neyman-Pearson
likelihood ratio test for such a system, is well developed in [1] under the assumption
of Gaussian clutter/interference. This processor, to be referred to as the joint-domain optimum
processor in this chapter, has the highest performance potential, which can be approached by adaptive
algorithms such as the Sample-Matrix-Inversion (SMI) [2], the Generalized Likelihood Ratio
(GLR) [3, 4], and the Modified SMI [5, 6]. To approach this detection performance potential,
however, these algorithms require that the training data set (i.e., the so-called secondary data
set) have at least 2NaNt to 3NaNt independent and identically distributed (iid) data vectors.
Obviously such a training-data size requirement is impractical even for moderate Na and Nt, as
the environment in which an airborne surveillance system operates is usually severely nonstationary
and nonhomogeneous. Besides, the computation load can easily become unbearable in
practice since it is proportional to (NaNt)^3. One should also note that lowering Na and Nt is not
necessarily desirable in practice as the performance potential critically depends on them if the
angle-Doppler spectrum of the clutter/interference is complicated.

A much more popular approach to space-time processing can be classified as cascade processing
with either the beamformer-Doppler processor configuration or the opposite order configuration.
In this chapter the former will be called the Space-Time (S-T) configuration and the latter the Time-Space
(T-S) configuration. Obviously the optimum detection theory can be applied separately
to  both  spatial  and  temporal  parts  of  both  S-T  and  T-S  configurations,  together  with  various 
adaptive  algorithms  available  for  each  part.  Of  course,  the  convergence  rate  and  computation 
load  problems  associated  with  adaptive  implementation  of  the  joint-domain  optimum  processor 
also  appear  with  the  cascade  configurations,  only  to  a  lesser  extent.  When  the  convergence 
does  occur,  the  performance  of  an  adaptive  implementation  with  the  S-T  (T-S)  configuration 
should  approach  that  of  the  optimum  processor  with  the  same  configuration.  Cascade  processing, 
especially  the  S-T  configuration,  has  been  so  popular  in  recent  years  that  it  seems  to  replace  the 
joint-domain  processor  in  the  airborne  surveillance  application.  Moreover,  arguments  can  often 
be  heard  about  which  cascade  configuration  has  higher  detection  performance  potential. 

The  first  objective  of  this  chapter  is  to  show  that 

(1)  neither  of  the  two  cascade  configurations  is  better  than  the  other,  and 

(2)  the  performance  potential  of  both  cascade  configurations  can  fall  far  below  that  of  the 
joint-domain  optimum  processor.  In  other  words,  we  show  that  if  one  wants  to  approach  the 
highest  performance  potential  offered  by  the  joint-domain  optimum  processor,  both  cascade 
configurations  should  be  avoided. 

As pointed out earlier in this section, it is difficult in practice to approach the performance
potential  of  the  joint-domain  optimum  processor  with  the  straightforward  application  of  adaptive 
algorithms  such  as  the  SMI,  Modified  SMI,  GLR,  etc.,  especially  in  a  severely  nonstationary  and 
nonhomogeneous  environment,  even  if  the  heavy  real-time  computation  could  become  affordable. 
Therefore,  the  second  objective  of  this  chapter  is  to  develop  a  new  adaptive  algorithm  for  the 
joint-domain  optimum  processor,  which  should  be  much  more  data-efficient  and  computationally 
efficient  than  the  aforementioned  ones.  This  new  algorithm  is  an  extension  of  our  recent  work 
reported  in  [7,  8]  for  adaptive  Doppler-domain  processing. 

This  chapter  is  organized  as  follows.  We  will  first  formulate  the  data  model  in  Section  2.2. 
In  Section  2.3  we  will  compare  the  performance  potentials  of  the  cascade  and  joint-domain 
processors. The new adaptive algorithm for the joint-domain optimum processor is presented
in  Section  2.4,  together  with  its  performance  analysis  and  comparison.  Finally,  Section  2.5 
summarizes  the  conclusions  with  some  discussion  of  related  issues. 

2.2  Data  Modeling 

Consider a narrowband antenna array with $N_a$ spatial channels (subarrays). Each channel
receives $N_t$ data samples corresponding to the return of a train of $N_t$ coherent pulses for a given
range cell. Let the column vector $\mathbf{x}_{t n_a}$, $N_t \times 1$, represent the $N_t$ baseband complex (I/Q) data
samples of the $n_a$th channel. The data matrix $\mathbf{X}$, $N_t \times N_a$, is defined by

$$\mathbf{X} = [\mathbf{x}_{t1}\ \ \mathbf{x}_{t2}\ \ \cdots\ \ \mathbf{x}_{tN_a}] = [\mathbf{x}_{a1}\ \ \mathbf{x}_{a2}\ \ \cdots\ \ \mathbf{x}_{aN_t}]^T \tag{2.1}$$

where “$T$” denotes the transpose, and the row vectors of $\mathbf{X}$, $\mathbf{x}_{a n_t}^T$, $n_t = 1, 2, \ldots, N_t$, are the
“snapshots” obtained along the spatial channels.

Under the signal-absence hypothesis $H_0$, the data matrix $\mathbf{X}$ consists of clutter/interference
and noise components only, i.e.,

$$\mathbf{X} = \mathbf{C} + \mathbf{N} \tag{2.2}$$

where $\mathbf{C}$ and $\mathbf{N}$ represent the clutter/interference and noise, respectively, and are assumed to be
independent. Under the signal-presence hypothesis $H_1$, a target signal component also appears
in the data matrix, i.e.,

$$\mathbf{X} = a\mathbf{S} + \mathbf{C} + \mathbf{N} \tag{2.3}$$

where $a$ is an unknown complex constant representing the amplitude of the signal and $\mathbf{S}$ is the
signal matrix of a known form. We call $\mathbf{X}$ the primary data set as it is from the range cell under
the hypothesis test.

For simplicity of discussion only, we assume that the spatial channels are colinear, identical,
omnidirectional, and equally spaced with spacing $d$, and that the pulses of the coherent pulse
train are identical with a constant Pulse Repetition Frequency (PRF). Under these assumptions,
the $(n_t, n_a)$th entry of the signal matrix $\mathbf{S}$ has the following form

$$s(n_t, n_a) = \exp\left[i2\pi(n_t - 1)\frac{2v}{\lambda\,\mathrm{PRF}} + i2\pi(n_a - 1)\frac{d\sin\theta}{\lambda}\right] \tag{2.4}$$

where $v$ is the radial velocity of the target, $\theta$ the direction of arrival of the target-return plane wave
with respect to the broadside of the array, and $\lambda$ the radar wavelength. Denoting

$$f_{at} = \frac{2v}{\lambda\,\mathrm{PRF}} \tag{2.5}$$

as the “normalized Doppler frequency” of the target signal, and

$$f_{aa} = \frac{d\sin\theta}{\lambda} \tag{2.6}$$

as the “spatial frequency”, $\mathbf{S}$ can be expressed by

$$\mathbf{S} = \mathbf{s}_a^T \otimes \mathbf{s}_t \tag{2.7}$$

where $\otimes$ is the Kronecker product, and

$$\mathbf{s}_t = [1\ \ \exp(i2\pi f_{at})\ \ \cdots\ \ \exp(i2\pi(N_t - 1)f_{at})]^T \tag{2.8}$$

and

$$\mathbf{s}_a = [1\ \ \exp(i2\pi f_{aa})\ \ \cdots\ \ \exp(i2\pi(N_a - 1)f_{aa})]^T \tag{2.9}$$

are the signal vectors in time and space domains, respectively. We assume that the parameters
PRF, $\lambda$, and $d$ have been properly chosen so that $f_{at}$ and $f_{aa}$ are confined within [-0.5, 0.5].
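
The structure of the signal model in (2.4) through (2.9) can be checked numerically with a small sketch. The array sizes and the two normalized frequencies are arbitrary example values, and the helper name `steering_vector` is an assumption introduced only for this illustration.

```python
import numpy as np

def steering_vector(freq, length):
    """[1, exp(i 2 pi f), ..., exp(i 2 pi (length - 1) f)], the form of (2.8) and (2.9)."""
    return np.exp(2j * np.pi * freq * np.arange(length))

n_t, n_a = 8, 4              # pulses per CPI and spatial channels (example values)
f_t, f_a = 0.2, -0.1         # normalized Doppler and spatial frequencies (example values)
s_t = steering_vector(f_t, n_t)
s_a = steering_vector(f_a, n_a)

# Signal matrix (N_t x N_a): entry (n_t, n_a) is the product of the two phasors in (2.4).
S = np.outer(s_t, s_a)

# Vec(.) stacks columns, which is column-major ("Fortran") order in NumPy.
vec_S = S.reshape(-1, order="F")
```

With this ordering, `S` equals the Kronecker product of the row vector `s_a` with the column vector `s_t`, and `vec_S` equals `np.kron(s_a, s_t)`, which is the form in which the signal enters the stacked-vector expressions later in the chapter.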

To statistically characterize the clutter/interference and noise components $\mathbf{C}$ and $\mathbf{N}$, we introduce
the notation Vec(·) for a matrix operation that stacks the columns of a matrix under
each other to form a new column vector. We assume that the $N_tN_a \times 1$ vector Vec($\mathbf{C} + \mathbf{N}$) has a
multivariate complex Gaussian distribution with zero mean and a covariance matrix $\mathbf{R}$. Under
this assumption, $\mathbf{x}_{t n_a}$, $n_a = 1, 2, \ldots, N_a$, and $\mathbf{x}_{a n_t}$, $n_t = 1, 2, \ldots, N_t$, will also be complex zero-mean
Gaussian. Let $\mathbf{R}_t$ and $\mathbf{R}_a$ be the covariance matrices of $\mathbf{x}_{t n_a}$ and $\mathbf{x}_{a n_t}$, respectively. It is easy to
see that $\mathbf{R}_t$ and $\mathbf{R}_a$ are submatrices of $\mathbf{R}$.

In the cases of unknown clutter/interference statistics, the data from the adjacent range cells,
conventionally referred to as the secondary data set, are also needed for estimating the covariance
of clutter/interference. Under both $H_1$ and $H_0$, they consist of the clutter/interference and noise
components only, and they are denoted by

$$\mathbf{Y}_k = \mathbf{C}_k + \mathbf{N}_k, \quad N_t \times N_a, \quad k = 1, 2, \ldots, K \tag{2.10}$$

where $K$ is the number of range cells available. We assume that $\mathbf{Y}_k$, $k = 1, 2, \ldots, K$, and $\mathbf{X}$ are
independent of each other and bear the same clutter/interference statistics, i.e., Vec($\mathbf{Y}_k$) should
also have a complex Gaussian distribution with zero mean and a covariance matrix $\mathbf{R}$.
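
As an illustration of how the secondary data enter an adaptive processor, the sketch below forms the usual sample covariance estimate by averaging the outer products of the stacked secondary data vectors. This is the generic estimate on which sample-matrix-type algorithms rely, not the algorithm developed in this chapter. The function name `sample_covariance`, the array layout, and the white-noise test data are assumptions.

```python
import numpy as np

def sample_covariance(secondary):
    """Average of Vec(Y_k) Vec(Y_k)^H over K secondary data matrices.

    `secondary` has shape (K, N_t, N_a); each Y_k is stacked column by column.
    """
    k = secondary.shape[0]
    vecs = secondary.transpose(0, 2, 1).reshape(k, -1)   # row k is Vec(Y_k)
    return vecs.T @ vecs.conj() / k

rng = np.random.default_rng(3)
K, n_t, n_a = 2000, 3, 2                                 # hypothetical sizes
Y = (rng.standard_normal((K, n_t, n_a))
     + 1j * rng.standard_normal((K, n_t, n_a))) / np.sqrt(2.0)
R_hat = sample_covariance(Y)
```

For unit-variance white data the estimate approaches the identity matrix as `K` grows. The estimate is an `N_t N_a` by `N_t N_a` matrix, which is why the number of secondary vectors needed grows with the product of the number of channels and pulses.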

2.3  Difference among the Performance Potentials of the Cascade
and Joint-Domain Processors

We will compare the detection performance potentials of the two cascade configurations and
the joint-domain processor under the assumption that the clutter/interference-plus-noise covariance
matrix is known. With the known covariance, the Space-Time (S-T) configuration is the
$N_a$th-order optimum spatial processor followed by the $N_t$th-order optimum temporal (Doppler)
processor, the Time-Space (T-S) configuration takes the opposite cascade, and the joint-domain
processor is the $N_tN_a$th-order optimum processor. Applying the result in [1] to the above three,
we list the optimum weight vectors below for easy reference.

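The S-T Configuration: we have

$$\mathbf{w}_{a,s\text{-}t} = c_{a,s\text{-}t}\,\mathbf{R}_a^{-1}\mathbf{s}_a \tag{2.11}$$

for the spatial domain weight vector, and

$$\mathbf{w}_{t,s\text{-}t} = c_{t,s\text{-}t}\,[(\mathbf{w}_{a,s\text{-}t}^H \otimes \mathbf{I})\,\mathbf{R}\,(\mathbf{w}_{a,s\text{-}t} \otimes \mathbf{I})]^{-1}\mathbf{s}_t \tag{2.12}$$

for the temporal domain weight vector, where $c_{a,s\text{-}t}$ and $c_{t,s\text{-}t}$ are constants. We recall that $\mathbf{R}_a$
and $\mathbf{R}_t$ are the covariance matrices for the rows and columns of $\mathbf{X}$, respectively; and $\mathbf{s}_t$ and $\mathbf{s}_a$
are specified by Eq. (2.8) and Eq. (2.9). The test statistic is

$$\eta_{s\text{-}t} = \mathbf{w}_{t,s\text{-}t}^H\,\mathbf{X}\,\mathbf{w}_{a,s\text{-}t}^* \tag{2.13}$$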


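The T-S Configuration: we have

$$\mathbf{w}_{t,t\text{-}s} = c_{t,t\text{-}s}\,\mathbf{R}_t^{-1}\mathbf{s}_t \tag{2.14}$$

and

$$\mathbf{w}_{a,t\text{-}s} = c_{a,t\text{-}s}\,[(\mathbf{I} \otimes \mathbf{w}_{t,t\text{-}s}^H)\,\mathbf{R}\,(\mathbf{I} \otimes \mathbf{w}_{t,t\text{-}s})]^{-1}\mathbf{s}_a \tag{2.15}$$

for the temporal and spatial weight vectors, respectively. The test statistic is

$$\eta_{t\text{-}s} = \mathbf{w}_{t,t\text{-}s}^H\,\mathbf{X}\,\mathbf{w}_{a,t\text{-}s}^* \tag{2.16}$$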

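The joint-domain optimum processor: the whole set of the data is processed all together by
an optimum weight vector as

$$\eta_J = \mathbf{w}_J^H\,\mathrm{Vec}(\mathbf{X}) \tag{2.17}$$

where $\mathbf{w}_J$ is

$$\mathbf{w}_J = c_J\,\mathbf{R}^{-1}\,\mathrm{Vec}(\mathbf{S}) \tag{2.18}$$

with $c_J$ being a constant scalar.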

One should note that the overall weight vectors for the two cascade configurations can have
the following equivalent expressions

$$\mathbf{w}_{s\text{-}t} = \mathbf{w}_{a,s\text{-}t} \otimes \mathbf{w}_{t,s\text{-}t} \tag{2.19}$$

and

$$\mathbf{w}_{t\text{-}s} = \mathbf{w}_{a,t\text{-}s} \otimes \mathbf{w}_{t,t\text{-}s} \tag{2.20}$$

The squared magnitude of the test statistic is compared with a chosen threshold $\eta_0$ which is
determined by the required probability of false alarm $P_f$ as

$$\eta_0 = -\ln P_f \tag{2.21}$$

and the signal presence is claimed if the test statistic surpasses the threshold.

From the result in [1], the probability of detection of the above three processors has the same
form below with their own weight vectors, i.e., $\mathbf{w}_{s\text{-}t}$, $\mathbf{w}_{t\text{-}s}$, and $\mathbf{w}_J$, to replace $\mathbf{w}$ therein

$$P_d = 1 - \exp(-\gamma)\int_0^{\eta_0} \exp(-t)\,I_0(2\sqrt{\gamma t})\,dt \tag{2.22}$$

where

$$\gamma = |a|^2\,\frac{\mathbf{w}^H\mathbf{s}\,\mathbf{s}^H\mathbf{w}}{\mathbf{w}^H\mathbf{R}\,\mathbf{w}} \tag{2.23}$$

with $\mathbf{s} = \mathrm{Vec}(\mathbf{S})$, and $I_0(\cdot)$ denotes the zero-th order modified Bessel function of the first kind.
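
A numerical evaluation of (2.22) can be sketched as follows. The upper integration limit is taken to be the threshold of (2.21), which is an assumption about the limits of the integral. The function name `detection_probability` and the use of the trapezoidal rule are illustrative choices.

```python
import numpy as np

def detection_probability(gamma, p_f, num_points=20001):
    """Evaluate (2.22) by trapezoidal integration, assuming an upper limit of -ln(p_f)."""
    eta0 = -np.log(p_f)
    t = np.linspace(0.0, eta0, num_points)
    integrand = np.exp(-t) * np.i0(2.0 * np.sqrt(gamma * t))
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))
    return 1.0 - np.exp(-gamma) * integral

p_f = 1e-5
pd_values = [detection_probability(g, p_f) for g in (0.0, 5.0, 10.0, 20.0)]
```

A useful sanity check is that a value of zero for `gamma` returns a detection probability equal to the false alarm probability, and that the result increases with `gamma`.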

The key to achieving the objective of the comparison easily is to identify a few typical cases,
from the vast number of varieties of clutter/interference conditions, which are also simple enough
for numerical evaluation. To do so, the following specifics are necessary.


(1)  The covariance matrix of the receiver noise is given by

$$E(\mathrm{Vec}(\mathbf{N})\,\mathrm{Vec}(\mathbf{N})^H) = \sigma_n^2\,\mathbf{I} \tag{2.24}$$

with $\mathbf{I}$ being the $N_tN_a \times N_tN_a$ identity matrix.


(2)  The clutter/interference is assumed to have a two-dimensional power
spectral density of the Gaussian shape centered at $[f_{ct}, f_{ca}]$

$$P_c(f_t, f_a) = \frac{\sigma_c^2}{2\pi\sigma_{ft}\sigma_{fa}} \exp\left[-\left(\frac{(f_t - f_{ct})^2}{2\sigma_{ft}^2} + \frac{(f_a - f_{ca})^2}{2\sigma_{fa}^2}\right)\right] \tag{2.25}$$

where $f_t$ and $f_a$ are the normalized Doppler frequency and spatial frequency,
respectively, and $\sigma_{ft}$ and $\sigma_{fa}$ the parameters controlling the
spread of the clutter/interference spectrum. The separation between
the signal and the center of the clutter/interference spectrum is denoted
by $\Delta f_t = f_{at} - f_{ct}$ and $\Delta f_a = f_{aa} - f_{ca}$.

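(3)  The covariance of the clutter/interference corresponding to the above
spectrum is then found to be

$$E(\mathrm{Vec}(\mathbf{C})\,\mathrm{Vec}(\mathbf{C})^H) = \sigma_c^2\,\mathbf{C}_a \otimes \mathbf{C}_t \tag{2.26}$$

where $\mathbf{C}_t$ and $\mathbf{C}_a$ are Toeplitz matrices specified by

$$\mathbf{C}_t = \mathrm{Toeplitz}\{[1\ \ \cdots\ \ e^{-2(\pi\sigma_{ft}(N_t-1))^2 - i2\pi(N_t-1)f_{ct}}]\} \tag{2.27}$$

and

$$\mathbf{C}_a = \mathrm{Toeplitz}\{[1\ \ \cdots\ \ e^{-2(\pi\sigma_{fa}(N_a-1))^2 - i2\pi(N_a-1)f_{ca}}]\} \tag{2.28}$$

respectively. It is easy to verify that (1) and (3) will lead to $\mathbf{R}_t = \sigma_c^2\mathbf{C}_t + \sigma_n^2\mathbf{I}$
and $\mathbf{R}_a = \sigma_c^2\mathbf{C}_a + \sigma_n^2\mathbf{I}$.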

We define the clutter/interference-to-noise ratio (INR) and the signal-to-clutter/interference-plus-noise ratio (SINR) by

$$\mathrm{INR} = \frac{\sigma_c^2}{\sigma_n^2} \tag{2.29}$$

and

$$\mathrm{SINR} = \frac{|a|^2}{\sigma_c^2 + \sigma_n^2} \tag{2.30}$$

Three simple cases are identified below, in each of which at least one of the cascade configurations suffers severe performance degradation, i.e., departs significantly from the joint-domain optimum.

Case 1. The signal and interference are “well” separated in the angle domain (in the sense that $\Delta f_s > 1/N_s$) but close to each other in the Doppler domain ($\Delta f_t < 1/N_t$). This situation is shown in the subplot in Fig. 2.1. The detection performance vs. SINR for the three processors is plotted in Fig. 2.1 with INR = 40 dB and $P_f = 10^{-5}$. The S-T configuration shows almost the same performance potential as the joint-domain optimum in this special case, while the performance loss for the T-S configuration becomes significantly large.

Case 2. The signal and interference are “well” separated in the Doppler domain but close to each other in the angle domain, as indicated by the subplot in Fig. 2.2. The T-S configuration is now close to the joint-domain optimum while the S-T configuration departs significantly.

Case 3. The clutter/interference spectrum has two peaks, one close to the signal in the angle domain and the other close to the signal in the Doppler domain. In this case both cascade configurations fail to approach the joint-domain optimum, as shown in Fig. 2.3.

Figure 2.2. Performance comparison of the three processing configurations: Case 2. (Probability of detection versus SINR.)

The above three cases are typical in the sense that we can draw from them the following conclusions:

(1)  neither  of  the  two  cascade  configurations  is  better  than  the  other,  and 

(2)  the  performance  potential  of  both  cascade  configurations  can  fall  far  below  that  of  the 
joint-domain  optimum  processor. 

These conclusions are also well justified intuitively. The T-S configuration in Case 1 suppresses the signal along with the clutter/interference, as the two have little separation in the Doppler frequency domain; the same happens to the S-T configuration in Case 2 in the angle domain. As both Case 1 and Case 2 can appear in practical situations without a priori knowledge, preselection of either cascade configuration is not appropriate. In Case 3 the signal and clutter/interference have little separation in either of the two domains, which results in the failure of both cascade configurations. However, the separation in the joint domain in Case 3 is still sufficiently large for the joint-domain optimum processor to succeed. As an airborne system has to deal with clutter/interference having both angle and Doppler spectral spread, it is important to make full use of the signal-clutter/interference separation, which cannot always be achieved by either of the two cascade configurations.

Although our study so far in this chapter has centered on the detection performance potentials, i.e., under the assumption of known clutter/interference statistics, it is sufficient for us to direct our attention only to the adaptive implementation of the joint-domain optimum processor, since the two cascade configurations have been shown to have limited potentials. This will be the focus of the remaining part of this chapter. Before we proceed, we should point out that, in addition to the problem of limited potentials, the two cascade configurations may have other serious problems of practical importance associated with their adaptive implementations, e.g., the difficulty of achieving a high-quality Constant False Alarm Rate (CFAR). To preserve the continuity of our main course, this issue is discussed briefly in Section 2.5.

2.4  The  Joint-Domain  Localized  GLR  Algorithm 

As pointed out in the introduction, the straightforward application of available adaptive algorithms such as the SMI, Modified SMI, and GLR has considerable difficulty approaching the joint-domain optimum processor in practice, especially in severely nonstationary and nonhomogeneous environments. Our goal here is to develop an adaptive implementation which is more data-efficient (in the sense of faster convergence, i.e., requiring fewer training data) as well as more computationally efficient. In addition, it is highly desirable in practice to have the adaptive algorithm possess an embedded CFAR feature and a low sensitivity to the deviation of the clutter/interference distribution from the assumed Gaussian.

To achieve the above goal we will follow the idea of localized adaptive processing as presented in [7, 8] for adaptive MTD. Although this idea is similar to that of beam-space processing in [9, 10, 11] under the term of partially adaptive array processing, the work in [7, 8] distinguishes itself from the previous study on beam-space processing in the following ways. References [7, 8] are the first to point out that when the training-data size is limited the use of localized adaptive processing is almost mandatory, and they have shown that localized adaptive processing can actually outperform fully adaptive processing in nonstationary and nonhomogeneous environments. Furthermore, References [7, 8] are also the first to study localized adaptive processing with the detection performance measure, which is of course the primary concern of surveillance systems. In contrast, the previous work on beam-space processing focuses on the steady-state performance and uses the signal estimation performance measure. As the primary concern of this paper is again detection in severely nonstationary and nonhomogeneous environments, it is natural to follow the work in [7, 8]. Of course, the extension represents a nontrivial task, as indicated by the complexity of the joint angle-Doppler domain.

As discussed in [7, 8], the localized processing idea can be applied with a variety of adaptive algorithms such as the SMI, Modified SMI, and GLR. We again choose the GLR because it offers the desirable embedded CFAR feature and is robust in non-Gaussian clutter/interference [5, 6]. Hence, the new algorithm presented in this section will be called the Joint-Domain Localized GLR (JDL-GLR).

2.4.1  The  JDL-GLR  Principle 

Figure 2.4 illustrates the principle of the JDL-GLR processor we propose. The data in the space-time domain, $\mathbf{X}$ ($N_t \times N_s$), is first transformed to the angle-Doppler domain. This multidimensional transform should be invertible to avoid any information loss, and it can be done most conveniently via the standard two-dimensional DFT (which is linear and orthogonal) under the assumption made in Section 2.2 for the spatial channels and pulse train. One should note that the Gaussianity assumed for $\mathbf{X}$ is not affected if the transformation is linear. The angle-Doppler domain data matrix $\mathcal{X}$ ($N_t \times N_s$) represents the data at the $N_t$ Doppler-bins and $N_s$ angle-bins of the range cell under the hypothesis test. The same transform is also performed on the secondary data $\mathbf{Y}_k$, $k = 1, 2, \ldots, K$, where $K$ is the number of adjacent iid cells, to obtain the angle-Doppler domain secondary data $\mathcal{Y}_k$ ($N_t \times N_s$), $k = 1, 2, \ldots, K$.
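The transform step can be sketched with a direct two-dimensional DFT. The code below assumes an orthonormal (unitary) scaling and a unit-modulus steering matrix whose Doppler and spatial frequencies fall exactly on a DFT bin; both are illustrative assumptions, and the function names are hypothetical.

```python
import cmath
import math

def dft2_unitary(x):
    """Unitary 2-D DFT of a matrix stored as a list of rows
    (rows: pulses / Doppler axis, columns: channels / angle axis)."""
    n_rows, n_cols = len(x), len(x[0])
    scale = 1.0 / math.sqrt(n_rows * n_cols)
    out = []
    for p in range(n_rows):
        row = []
        for q in range(n_cols):
            acc = 0j
            for n in range(n_rows):
                for m in range(n_cols):
                    phase = -2j * math.pi * (p * n / n_rows + q * m / n_cols)
                    acc += x[n][m] * cmath.exp(phase)
            row.append(acc * scale)
        out.append(row)
    return out

def steering_matrix(n_t, n_s, p0, q0):
    """Unit-modulus space-time steering matrix centered on DFT bin (p0, q0)."""
    return [[cmath.exp(2j * math.pi * (p0 * n / n_t + q0 * m / n_s))
             for m in range(n_s)] for n in range(n_t)]
```

Under these assumptions the transformed steering matrix has a single nonzero entry of magnitude $\sqrt{N_t N_s}$, and the transform preserves total energy, so no information is lost.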

In practice, only the few angle-bins covering the angle section centered at the broadside of the array (i.e., around the look direction, where most of the transmitted energy is contained) need to be tested, while up to all of the Doppler-bins should be tested, as the target Doppler frequency shift is unknown to the processor. Let $N_{s0}$ be the number of angle-bins of interest. The $N_t \times N_{s0}$ bins to be tested will be divided into $L$ groups, each of which contains $N_{s0}$ angle-bins and a small number of adjacent Doppler-bins. An example of this grouping is given in Fig. 2.5, where $N_t = 24$, $N_s = 12$, and $N_{s0} = 3$. We note that the number of Doppler-bins in each group need not be the same and that some overlap can also be justified. The purpose of dividing along the Doppler axis is to avoid the use of an adaptive processor with a large number of degrees of freedom, which demands a large training-data set as well as a large amount of computation. This opportunity to “divide and conquer” is, of course, made available by the multidimensional transformation from the space-time data domain to the angle-Doppler domain, which decouples the degrees of freedom necessary for handling complicated clutter/interference from the number of data points to be processed. Based on our experience gained from the work in [7, 8], the number of bins in each group is expected to have only minor influence on the detection performance and should be in the range of $2N_{s0}$ to $4N_{s0}$ in general. The angle-Doppler domain secondary data $\mathcal{Y}_k$, $k = 1, 2, \ldots, K$, should be grouped in the same way.
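The grouping along the Doppler axis can be sketched as follows. The equal-width partition with optional overlap is an assumption made for illustration; it is not the specific coverage of Fig. 2.5, and the function names are hypothetical.

```python
def doppler_groups(n_doppler, width, overlap=0):
    """Split Doppler-bin indices 0..n_doppler-1 into contiguous groups of at
    most `width` bins, with `overlap` bins shared by neighbouring groups."""
    if n_doppler < 1 or width < 1 or not 0 <= overlap < width:
        raise ValueError("need n_doppler >= 1, width >= 1, 0 <= overlap < width")
    step = width - overlap
    groups = []
    start = 0
    while True:
        groups.append(list(range(start, min(start + width, n_doppler))))
        if start + width >= n_doppler:
            return groups
        start += step

def group_order(group, n_s0):
    """Order (degrees of freedom) of one localized processor:
    number of Doppler bins in the group times number of angle bins."""
    return len(group) * n_s0
```

For example, `doppler_groups(24, 4)` yields six groups of four Doppler bins; with three angle-bins each, every localized processor has order 12, the upper end of the suggested range.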

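Let $N_{tl}$ be the number of Doppler-bins and $N_l = N_{tl} \times N_{s0}$ the total number of angle-Doppler bins in the $l$th group. An $N_l$th-order GLR processor will perform the threshold detection on the $N_l$ bins of the $l$th group with the test statistic

$$\eta_{nm}^{(l)} = \frac{\left|\mathrm{Vec}(\mathbf{S}_{nm})^H \mathbf{R}_l^{-1}\, \mathrm{Vec}(\mathcal{X}_l)\right|^2}{\mathrm{Vec}(\mathbf{S}_{nm})^H \mathbf{R}_l^{-1}\, \mathrm{Vec}(\mathbf{S}_{nm}) \left(1 + \mathrm{Vec}(\mathcal{X}_l)^H \mathbf{R}_l^{-1}\, \mathrm{Vec}(\mathcal{X}_l)\right)} \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \eta_0^{(l)},$$

$$n = 1, 2, \ldots, N_{tl}, \qquad m = 1, 2, \ldots, N_{s0} \tag{2.31}$$

where

$$\mathbf{R}_l = \sum_{k=1}^{K} \mathrm{Vec}(\mathcal{Y}_{lk})\, \mathrm{Vec}(\mathcal{Y}_{lk})^H, \tag{2.32}$$

and $\mathbf{S}_{nm}$ ($N_{tl} \times N_{s0}$) is the signal-steering matrix in the angle-Doppler domain for the $nm$th bin of the $l$th GLR. For a uniform PRF and array spacing, it is easy to see that $\mathbf{S}_{nm}$ has all its entries equal to zero except the $nm$th one, which is $\sqrt{N_t N_s}$. We note that the threshold $\eta_0^{(l)}$ need not be the same across the $L$ groups, as evidenced in Subsection 2.4.2 below.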
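A sketch of the localized test statistic for one group is given below. It assumes the standard GLR form, i.e. the squared matched output normalized by the steering-vector norm and by one plus the data norm, both measured through the inverse of the summed secondary-data outer products. Vectors are plain Python lists, and all function names are hypothetical.

```python
import random

def solve(a, b):
    """Solve a x = b for a square complex matrix by Gaussian elimination
    with partial pivoting; the inputs are not modified."""
    n = len(a)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0j] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def glr_statistic(x, s, secondary):
    """GLR statistic for primary vector x, steering vector s and a list of
    secondary vectors (at least len(x) of them, so that R is invertible)."""
    n = len(x)
    # R = sum_k y_k y_k^H, left unnormalized.
    r = [[sum(y[i] * y[j].conjugate() for y in secondary) for j in range(n)]
         for i in range(n)]
    rinv_x = solve(r, x)
    rinv_s = solve(r, s)
    num = abs(inner(s, rinv_x)) ** 2
    den = inner(s, rinv_s).real * (1.0 + inner(x, rinv_x).real)
    return num / den

def random_complex_vector(n, rng):
    """Illustrative test data: n independent complex Gaussian samples."""
    return [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
```

The statistic lies between 0 and 1 and is unchanged when the primary and secondary data are scaled by a common factor, which is the property behind the embedded CFAR behavior.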

Figure 2.4. Block diagram illustrating the principle of the Joint-Domain Localized GLR processor.

Figure 2.5. A JDL-GLR example.


2.4.2  The  JDL-GLR  Detection  Performance 

The detection performance of the original GLR in Gaussian clutter/interference is given in [3, 4] with deterministic target modeling and in [12] with stochastic target modeling. As for the Doppler-domain localized GLR of [7, 8], it is straightforward to extend the results in [3, 4, 12] to obtain the probabilities of detection and false alarm, $P_d$ and $P_f$, of the JDL-GLR with both target models. Below we list only the results for the case of non-fluctuating targets, omitting the straightforward derivation.

The probability of detection at the $nm$th bin of the $l$th GLR is found to be

[Equations (2.33)-(2.36): illegible in the source.]

with $\mathcal{R}_l$ being the covariance matrix of $\mathrm{Vec}(\mathcal{X}_l)$.

The probability of false alarm for all bins in the $l$th GLR is given by

$$P_f^{(l)} = \left(1 - \eta_0^{(l)}\right)^{K - N_l + 1}. \tag{2.37}$$

Obviously the probability of false alarm can be made equal across the $L$ groups by choosing different $\eta_0^{(l)}$, $l = 1, 2, \ldots, L$. Eq. (2.37) also clearly indicates that, like the original GLR and the Doppler-domain localized GLR, the JDL-GLR has the “integrated/embedded” CFAR feature, as $P_f^{(l)}$, $l = 1, 2, \ldots, L$, do not depend on the covariance of the clutter/interference.
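Threshold selection per group can be sketched by inverting the false-alarm relation. The code assumes the relation $P_f = (1 - \eta_0)^{K - N + 1}$ for an $N$th-order GLR trained on $K$ secondary vectors; the function names are hypothetical.

```python
def glr_false_alarm(eta0, k, n):
    """False-alarm probability of an n-th order GLR with k secondary vectors,
    under the assumed relation pf = (1 - eta0) ** (k - n + 1)."""
    return (1.0 - eta0) ** (k - n + 1)

def glr_threshold(pf, k, n):
    """Threshold eta0 that yields the requested false-alarm probability pf."""
    if k < n:
        raise ValueError("need at least n secondary vectors")
    return 1.0 - pf ** (1.0 / (k - n + 1))
```

Only $K$, the group order, and the desired $P_f$ enter the calculation; no clutter covariance is needed. Groups of different order therefore receive different thresholds for the same $P_f$.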

2.4.3  Detection  Performance  Comparison

Although the convergence-rate advantage of the JDL-GLR can be seen intuitively from the fact that the localized GLRs have far fewer degrees of freedom than a high-order GLR applied directly to the space-time domain data, the numerical example below demonstrates this advantage clearly.

Consider a system with $N_s = 12$ and $N_t = 24$. The clutter/interference is assumed to have the two-dimensional multipeak Gaussian-shaped power spectral density (psd) shown in Fig. 2.6. For convenience of reference we have also indicated the center locations of this multipeak spectrum in Fig. 2.5. The exact expression of this psd is given by

$$P_c(f_t, f_s) = \sum_{d} \sigma_{cd}^2\, \frac{1}{2\pi\sigma_{ft}\sigma_{fs}} \exp\left[-\left(\frac{(f_t - f_{ctd})^2}{2\sigma_{ft}^2} + \frac{(f_s - f_{csd})^2}{2\sigma_{fs}^2}\right)\right] \tag{2.38}$$

where $\sigma_{cd}^2$ is the power of the $d$th component. Obviously, the total clutter/interference power $\sigma_c^2$ is

$$\sigma_c^2 = \sum_{d} \sigma_{cd}^2. \tag{2.39}$$

We set $\sigma_{c1}^2 = \sigma_{c2}^2 = \sigma_{c4}^2 = \sigma_{c5}^2 = \sigma_{c6}^2 = \sigma_{c3}^2/10^{2.5}$, INR = 50 dB, and SNR = 0 dB, which gives SINR $\approx -50$ dB. The thresholds for the processors to be compared are set such that every processor has a probability of false alarm $P_f = 10^{-5}$ at each tested bin. We assume that there are $K = 24$ adjacent cells from which the iid secondary data set is obtained.
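The quoted SINR can be checked with a short calculation. The sketch assumes that SNR denotes $|a|^2/\sigma_n^2$, so that with INR $= \sigma_c^2/\sigma_n^2$ and SINR $= |a|^2/(\sigma_c^2 + \sigma_n^2)$ one obtains SINR = SNR/(INR + 1). The function name is hypothetical.

```python
import math

def sinr_db(snr_db, inr_db):
    """SINR in dB from SNR and INR in dB, using SINR = SNR / (INR + 1)."""
    snr = 10.0 ** (snr_db / 10.0)
    inr = 10.0 ** (inr_db / 10.0)
    return 10.0 * math.log10(snr / (inr + 1.0))
```

With SNR = 0 dB and INR = 50 dB this evaluates to very nearly -50 dB, in agreement with the value stated above.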

Consider the following five processors:

(1) the joint-domain optimal,

(2) the JDL-GLR with $L = 7$ localized GLR processors, with their coverage shown in Fig. 2.5,

(3) the T-S configuration with the optimal processor for each part,

(4) the S-T configuration with the optimal processor for each part, and

(5) a conventional beamformer followed by the optimal temporal processor (i.e., the optimal MTI).

We note that with $N_s = 12$ and $N_t = 24$ but only $K = 24$, any straightforward adaptive implementation of the joint-domain optimal, any adaptive processor with the S-T configuration, and any adaptive processor with the T-S configuration will fail to deliver an acceptable detection performance for this example, since $K = 24$ is too small with respect to their degrees of freedom. Therefore, these adaptive processors are excluded from the above list for detailed comparison.

Fig. 2.7 shows the probability of detection of the five processors listed at the 6th angle bin, which is the assumed angle of arrival of the target signal. Obviously, the JDL-GLR is the only one that approaches the joint-domain optimal, except at a few bins adjacent to the center of the strongest clutter/interference spectrum component. The poor performance of the two optimal cascade configurations should not be a surprise given the discussion in Section 2.3. The fact shown in Fig. 2.7 that the ad hoc processor No. 5 can outperform them (especially the optimal S-T configuration) is also strong evidence that optimality does not always mean much with a wrong configuration. Of course, the poorest performance, that of the optimal S-T configuration, is due to the fact that the optimal spatial part of the processing nulls the target signal as well as the clutter/interference. Finally, we comment that a CFAR loss is inevitably associated with any adaptive implementation of the four optimal/partially optimal processors in Fig. 2.7, while the embedded CFAR feature of the JDL-GLR makes any additional CFAR processing unnecessary.

2.4.4  Other  Features  of  JDL-GLR 

The computational advantage of the JDL-GLR is clear. Recall that the $N$th-order GLR has a computation load proportional to $N^3$. Assume that each localized GLR spans three angle-bins and four Doppler bins and that $N_t/4$ localized GLRs are required. This leads to a computation load proportional to $(N_t/4)(3 \times 4)^3 = 432 N_t$ for the JDL-GLR. With a load of $N_s^3 N_t^3$ for the straightforward application of the GLR to the space-time domain data, the JDL-GLR will show a computation advantage when $N_t > 4$ and $N_s > 3$. For large $N_t$ and $N_s$ the JDL-GLR offers a computation load reduction by a factor of

$$\gamma = N_t^2 N_s^3 / 432. \tag{2.40}$$

Figure 2.7. Detection performance comparison of the five processors (horizontal axis: Doppler bin number).

For the example of $N_t = 24$ and $N_s = 12$ in this section, the JDL-GLR’s computation load is only 1/2304 of that for the straightforward application of the GLR (or SMI) to the space-time domain data. Like the Doppler-domain localized GLR in [7, 8], the JDL-GLR can further reduce its computation load by deleting the localized GLR processors for the region where detection performance improvement is unnecessary or impossible. This can be done when some a priori information is available about the power concentration of the clutter/interference in the angle-Doppler domain. Furthermore, the realization of the JDL-GLR benefits from the available parallel processing techniques, as its localized GLRs all operate in parallel.
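The load comparison can be reproduced with a few lines of arithmetic. The sketch uses the proportionality of the load to the cube of the processor order, as stated in the text, and ignores constant factors; the function names are hypothetical.

```python
def full_glr_load(n_t, n_s):
    """Relative load of one GLR applied to all n_t * n_s space-time samples."""
    return (n_t * n_s) ** 3

def jdl_glr_load(n_t, angle_bins=3, doppler_bins=4):
    """Relative load of n_t / doppler_bins localized GLRs, each of order
    angle_bins * doppler_bins."""
    return (n_t / doppler_bins) * (angle_bins * doppler_bins) ** 3

def reduction_factor(n_t, n_s):
    """Ratio of the full-order load to the JDL-GLR load."""
    return full_glr_load(n_t, n_s) / jdl_glr_load(n_t)
```

For $N_t = 24$ and $N_s = 12$, `reduction_factor(24, 12)` agrees with $N_t^2 N_s^3/432 = 2304$.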

Since the robustness in non-Gaussian clutter/interference resides with the GLR processor, which is not affected by the linear transformation, the JDL-GLR is expected to maintain its robustness. Computationally intensive simulation is being conducted to confirm this feature, and the result will be published separately [13].

2.5  Conclusions  and  Discussion 

This  chapter  shows: 

(1)  neither  of  the  two  cascade  configurations  is  better  than  the  other; 

(2)  the  performance  potential  of  both  cascade  configurations  can  fall  far  below  that  of  the 
joint-domain  optimum  processor;  and 

(3) the Joint-Domain Localized GLR algorithm (JDL-GLR) offers an attractive solution to the problem of approaching the performance potential of the joint-domain optimum processor of a high order ($N_s \times N_t$) with a fast convergence rate and high computation efficiency, together with such highly desirable features as the embedded CFAR and robustness in non-Gaussian clutter/interference.

Finally, we would like to point out that both cascade configurations may have considerable difficulty achieving a high-quality CFAR in practice when both spatial and temporal parts are adaptive. This is because of the random modulation introduced by the adaptive algorithm for the first of the two cascaded parts. The problem may become more severe in highly nonstationary and nonhomogeneous environments, where there is not enough iid training data to smooth out the extra random modulation. In contrast, the JDL-GLR presented in this chapter is free of such random modulation and can maintain its CFAR performance with a much smaller amount of iid training data. A simulation-based comparison of the CFAR performance of adaptive spatial-temporal processors can be found in [13].

Chapter  3

Literature  Review  on  Spherically  Invariant  Random  Processes

3.1  Introduction 

We present an overview of the literature as it pertains to the modeling of radar clutter by spherically invariant random processes. In addition, relevant mathematical preliminaries are presented in this chapter. When a radar transmits a signal, the received echo may consist of returns from one or more targets, buildings, trees, water, land and weather, depending on the environment. The target returns contribute to the desired signal while the other returns contribute to the clutter. Many investigators [14, 15, 16, 17] have reported experimental measurements for which the clutter probability density function has an extended tail. The extended tail gives rise to relatively large probabilities of false alarm. The Gaussian model for the clutter fails to predict this behavior. Two approaches have been used to explain the non-Gaussian behavior. One of them is based on the fact that the assumptions of the central limit theorem (CLT) may fail. The other approach is based on the nonstationary reflectivity properties of the scanned areas. In any event, non-Gaussian models for the univariate (marginal) clutter PDF have been proposed. Commonly reported marginal non-Gaussian PDFs for the clutter are the Weibull [14], Log-normal [18, 19] and K-distributions [16, 20, 15]. Second-order statistics for these models have been reported in terms of autocorrelation functions or power spectral densities [21, 17].

The Weibull [14] and Log-normal [15] models for radar clutter are primarily based on empirical studies, while the K-distribution has been shown to have physical significance [22, 15] in that the observed statistical properties can be related to the electromagnetic and geometric factors pertaining to the scattering surface. Computer simulation schemes for Weibull and Log-normal clutter based on the univariate PDFs and correlation functions have been developed in [23] and [24], respectively. Extensions of the Weibull, Log-normal and K-distributed clutter models for coherent radar processing have been developed in [25, 18, 26], respectively.

Statistical  characterization  of  the  clutter  is  necessary  in  order  to  obtain  the  optimal  radar  signal 
processor.  Usually,  radars  process  N  pulses  at  a  time.  A  complete  statistical  characterization 
of  the  clutter  requires  the  specification  of  the  joint  probability  density  function  (PDF)  of  the 
N  samples.  When  the  pulse  returns  are  statistically  independent,  the  joint  PDF  is  simply  the 
product  of  the  marginal  PDFs.  However,  the  clutter  can  be  highly  correlated.  In  fact,  the 
correlation  between  samples  is  useful  in  canceling  the  clutter.  Consequently,  it  is  desirable  to 
include  the  correlation  information  in  the  multivariate  PDF.  For  non-Gaussian  processes  this 
can  be  done  in  more  than  one  way.  The  theory  of  spherically  invariant  random  processes  (SIRP) 
provides  a  powerful  mechanism  for  obtaining  the  joint  PDF  of  the  N  correlated  non-Gaussian 
random variables. Applications of the theory of SIRPs can be found in the problem of random flights [27], signal detection and estimation problems in communication theory [28, 29], speech signal processing [30, 31], and radar clutter modeling and simulation [32, 26, 33, 34, 35]. The following sections provide a brief overview of the literature on the theory of SIRPs.

3.2  Definitions 

In this section we present certain definitions and mathematical preliminaries pertaining to the theory of SIRPs. A random vector $\mathbf{Y} = [Y_1, Y_2, \ldots, Y_N]^T$ is said to be a spherically invariant random vector (SIRV) if its PDF has the form

$$f_{\mathbf{Y}}(\mathbf{y}) = k\,|\boldsymbol{\Sigma}|^{-1/2}\, h_N[(\mathbf{y} - \mathbf{b})^T \boldsymbol{\Sigma}^{-1} (\mathbf{y} - \mathbf{b})] \tag{3.1}$$

where $k$ is a normalization constant chosen so that the volume under the PDF is unity, $\mathbf{b}$ is an $N \times 1$ vector, $\boldsymbol{\Sigma}$ is an $N \times N$ non-negative definite matrix, and $h_N(\cdot)$ is a one-dimensional, positive, real-valued, monotonically decreasing function. Note that the PDF of an SIRV is elliptically symmetric (i.e., contours of constant $f_{\mathbf{Y}}(\mathbf{y})$ are ellipses). If every random vector obtained by sampling a random process $y(t)$ is a spherically invariant random vector, regardless of the sampling instants or the number of samples, then the process $y(t)$ is defined to be a spherically invariant random process (SIRP).

Kingman [27] introduced the definition of spherically symmetric random vectors (SSRV). In particular, a random vector $\mathbf{X} = [X_1, X_2, \ldots, X_N]^T$ is said to be spherically symmetric provided its PDF has the form

$$f_{\mathbf{X}}(\mathbf{x}) = k\,h_N(x_1^2 + x_2^2 + \ldots + x_N^2) = k\,h_N(\mathbf{x}^T\mathbf{x}) \tag{3.2}$$

where $h_N(\cdot)$ is an arbitrary, non-negative, monotonically decreasing radial function of dimension $N$ and $k$ is a normalization constant chosen so that the volume under the PDF is unity. The subscript $N$ is used to emphasize that we are dealing with $N$ random variables. Throughout the manuscript, it is assumed that the PDF of a random vector is the joint PDF of its components. Equivalently, if $\boldsymbol{\omega} = [\omega_1, \omega_2, \ldots, \omega_N]^T$, the characteristic function of the SSRV $\mathbf{X}$, defined by $\Phi_{\mathbf{X}}(\boldsymbol{\omega}) = E[\exp(j\boldsymbol{\omega}^T\mathbf{X})]$, has the form

$$\Phi_{\mathbf{X}}(\boldsymbol{\omega}) = g_N(\omega_1^2 + \omega_2^2 + \ldots + \omega_N^2) \tag{3.3}$$

where $g_N(\cdot)$ is a non-negative conjugate symmetric function which is magnitude integrable. An SSRV is a special case of an SIRV, arising from eq (3.1) when $\mathbf{b} = \mathbf{0}$ and $\boldsymbol{\Sigma} = \mathbf{I}$, where $\mathbf{I}$ is the identity matrix. In Appendix A, we prove that the characteristic function of an SSRV is also spherically symmetric.

3.3  Characterization  of  SIRPs 

In this section we present some important theorems that help us to characterize the PDF of an SIRV. The work of Yao [28] and Kingman [36] gave rise to a representation theorem for SSRVs. The representation theorem can be stated as follows.

Theorem 1 If a random vector $\mathbf{X} = [X_1, X_2, \ldots, X_N]^T$ is an SSRV for any $N$, then there exists a non-negative random variable $T$ such that the random variables $X_i$ ($i = 1, 2, \ldots, N$), conditioned on $T = t$, are independent, identically distributed Gaussian random variables with zero mean and variance equal to $2t$.

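Proof: By definition, the characteristic function of $\mathbf{X}$ is

$$\Phi_{\mathbf{X}}(\boldsymbol{\omega}) = E[\exp(j\boldsymbol{\omega}^T\mathbf{X})] = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} \exp(j\boldsymbol{\omega}^T\mathbf{x})\, f_{\mathbf{X}}(\mathbf{x})\, d\mathbf{x}. \tag{3.4}$$

The PDF of the random variable $T$ is introduced by noting that

$$f_{\mathbf{X}}(\mathbf{x}) = \int_{-\infty}^{\infty} f_{\mathbf{X},T}(\mathbf{x}, t)\, dt = \int_{-\infty}^{\infty} f_{\mathbf{X}|T}(\mathbf{x}|t)\, f_T(t)\, dt. \tag{3.5}$$

Substituting into the expression for the characteristic function and interchanging the order of integration, we obtain

$$\Phi_{\mathbf{X}}(\boldsymbol{\omega}) = \int_{-\infty}^{\infty} \Phi_{\mathbf{X}|T}(\boldsymbol{\omega}, t)\, f_T(t)\, dt \tag{3.6}$$

where

$$\Phi_{\mathbf{X}|T}(\boldsymbol{\omega}, t) = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} \exp(j\boldsymbol{\omega}^T\mathbf{x})\, f_{\mathbf{X}|T}(\mathbf{x}|t)\, d\mathbf{x}. \tag{3.7}$$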

Since $\mathbf{X}$ is an SSRV for any $N$, its characteristic function has the form of eq (3.3). This requires that $\Phi_{\mathbf{X}|T}(\boldsymbol{\omega}, t)$ also be a function of $(\omega_1^2 + \omega_2^2 + \ldots + \omega_N^2)$ for any choice of $N$. The only characteristic function having this property [36] is

$$\Phi_{\mathbf{X}|T}(\boldsymbol{\omega}, t) = \exp[-t(\omega_1^2 + \omega_2^2 + \ldots + \omega_N^2)] \tag{3.8}$$

where the conditional PDF of $\mathbf{X}$, given $T = t$, is recognized to be multivariate Gaussian, with $X_i$ ($i = 1, 2, \ldots, N$) being statistically independent, identically distributed, zero-mean Gaussian random variables with variance $2t$. Because the variance equals $2t$, $T$ must be a non-negative random variable. This establishes the theorem. Note that the theorem does not give any physical significance for $T$. Neither does it reveal how to determine $f_T(t)$.
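The identification of the variance as $2t$ can be made explicit with the standard characteristic function of a zero-mean Gaussian random variable. In the short derivation below, $\sigma^2$ is introduced only for illustration as the unknown conditional variance.

```latex
% Characteristic function of N i.i.d. zero-mean Gaussian variables with variance \sigma^2:
E\left[\exp\left(j\boldsymbol{\omega}^T\mathbf{X}\right) \mid T = t\right]
  = \prod_{i=1}^{N} \exp\left(-\tfrac{1}{2}\sigma^2\omega_i^2\right)
  = \exp\left[-\tfrac{1}{2}\sigma^2\left(\omega_1^2 + \omega_2^2 + \ldots + \omega_N^2\right)\right].
% Matching this with eq (3.8) requires \sigma^2 / 2 = t, that is, \sigma^2 = 2t.
```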

The representation theorem for SSRVs allows us to write the random vector $\mathbf{X}$ as a product of a Gaussian random vector $\mathbf{Z}$ having zero mean and identity covariance matrix and a non-negative random variable $S = \sqrt{2T}$ with PDF $f_S(s)$. In particular, consider the product $\mathbf{X} = \mathbf{Z}S$. $S$ is assumed to be non-negative for convenience. The PDF of $\mathbf{X}$ conditioned on $S$ is then given by

$$f_{\mathbf{X}|S}(\mathbf{x}|s) = (2\pi)^{-N/2}\, s^{-N} \exp\left(-\frac{p}{2s^2}\right) \tag{3.9}$$

where $p = \mathbf{x}^T\mathbf{x}$. From the theorem on total probability, the PDF of $\mathbf{X}$ can be written as

$$f_{\mathbf{X}}(\mathbf{x}) = (2\pi)^{-N/2} \int_0^{\infty} s^{-N} \exp\left(-\frac{p}{2s^2}\right) f_S(s)\, ds. \tag{3.10}$$

Comparing eqs (3.10) and (3.2), we can write $k = (2\pi)^{-N/2}$ and

$$h_N(p) = \int_0^{\infty} s^{-N} \exp\left(-\frac{p}{2s^2}\right) f_S(s)\, ds. \tag{3.11}$$

Thus, it is clear that the PDF of an SSRV is uniquely determined by the specification of a Gaussian random vector having zero mean and identity covariance matrix and a first-order PDF $f_S(s)$ called the characteristic PDF.
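The product representation $\mathbf{X} = \mathbf{Z}S$ suggests a direct way to draw SSRV samples, sketched below. The two-level characteristic PDF is a hypothetical choice made only for illustration, as are the function names.

```python
import random

def sample_ssrv(n, draw_s, rng):
    """Draw one realization of X = Z * S: n independent standard Gaussian
    samples, all multiplied by a single non-negative draw of S."""
    s = draw_s(rng)
    if s < 0:
        raise ValueError("S must be non-negative")
    return [rng.gauss(0.0, 1.0) * s for _ in range(n)]

def constant_scale(rng):
    """Degenerate case S = 2, which gives a Gaussian vector with variance 4."""
    return 2.0

def two_level_scale(rng):
    """Hypothetical characteristic PDF: S = 0.5 or 2.0 with equal probability."""
    return 0.5 if rng.random() < 0.5 else 2.0
```

Each component then has second moment $E[S^2]$; the components are uncorrelated but, for a non-degenerate $S$, not independent, since they share the same scale.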

The following theorem in [37] states that an SIRV is related to an SSRV by a linear transformation.

Theorem 2 If $\mathbf{X}$ is an SSRV with characteristic PDF $f_S(s)$, then the deterministic linear transformation

$$\mathbf{Y} = \mathbf{A}\mathbf{X} + \mathbf{b} \tag{3.12}$$

results in $\mathbf{Y}$ being an SIRV having mean vector $\mathbf{b}$, covariance matrix $\boldsymbol{\Sigma} = \mathbf{A}\mathbf{A}^T$ and the same characteristic PDF. It is required that $\mathbf{A}\mathbf{A}^T$ be nonsingular.

Proof:  Since  X  is  an  SSRV,  we  can  express  X  as  X  =  ZS,  where  Z  is  a  Gaussian  random  vector 
having  zero  mean  and  identity  covariance  matrix  and  5  is  a  non-negative  random  variable.  Hence, 

Y  =  AZ5  +  b.  (3.13) 

Conditioned  on  S ,  the  PDF  of  Y  is  Gaussian,  with  mean  vector  equal  to  b  and  covariance  matrix 
equal  to  AATs2.  The  PDF  of  Y  conditioned  on  S  is  given  by 

/Y|s(y|s)  =  (27r)-^|S|-*s-A/exp(-^j)  (3.14) 

where  p  =  (y  —  b)rE_1(y  —  b)  and  |E|  denotes  the  determinant  of  the  covariance  matrix  E  = 
AAt.  Using  the  theorem  on  total  probability,  the  PDF  of  Y  can  be  written  as 

/Y(y)  =  (2*)-?|£|-iMp)  (3.15) 

where 

hN(p)  —  jf  s~N exp{--~)fs{s)ds.  (3.16) 
The PDF of Y is of the form of eq (3.1). Therefore, Y is an SIRV. The PDF of an SIRV is uniquely
determined by the specification of a mean vector, a covariance matrix and a first order PDF called
the characteristic PDF. Theorem 1 for SSRVs generalizes for SIRVs in a straightforward manner.
The only difference is that, conditioned on the non-negative random variable T, the $\{Y_k : (k = 1, 2, \ldots, N)\}$
are no longer statistically independent. Instead, the PDF of Y conditioned on T is
a multivariate Gaussian PDF. By the same argument used for SSRVs, an SIRV can be written as
a product of a Gaussian random vector and a non-negative random variable. The only difference
is that the mean of the Gaussian random vector need not be zero and its covariance matrix is not
the identity matrix. As a corollary of Theorem 2 [28], it can be readily shown that every linear
transformation on an SIRV results in another SIRV having the same characteristic PDF. As a
special case, when $f_S(s) = \delta(s - 1)$, where $\delta(\cdot)$ is the unit impulse function, $h_N(p) = \exp(-p/2)$ and
the corresponding SIRV PDF given by eq (3.15) is the multivariate Gaussian PDF. Therefore,
the multivariate Gaussian PDF is a special case of the SIRV PDF.
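
The product representation of eq (3.13) also suggests a direct way to draw SIRV samples. The following Python sketch is illustrative only: the two-point characteristic PDF, the matrix A, the vector b and all function names are assumptions made for the example. The characteristic PDF is chosen so that E[S^2] = 1, in which case the covariance matrix of Y equals A A^T.

```python
import math
import random

def draw_s(rng):
    # Hypothetical characteristic PDF: S^2 is 0.5 or 1.5 with equal
    # probability, so E[S^2] = 1 and Cov(Y) = A A^T.
    return math.sqrt(0.5) if rng.random() < 0.5 else math.sqrt(1.5)

def sample_sirv(A, b, rng):
    """Draw one realization of Y = A Z S + b, as in eq (3.13)."""
    n = len(b)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]  # Z: zero mean, identity covariance
    s = draw_s(rng)                              # S: drawn independently of Z
    return [b[i] + s * sum(A[i][j] * z[j] for j in range(n)) for i in range(n)]

def sample_moments(samples):
    """Return the sample mean vector and the sample covariance matrix."""
    m = len(samples)
    n = len(samples[0])
    mean = [sum(y[i] for y in samples) / m for i in range(n)]
    cov = [[sum((y[i] - mean[i]) * (y[j] - mean[j]) for y in samples) / m
            for j in range(n)] for i in range(n)]
    return mean, cov

rng = random.Random(0)
A = [[1.0, 0.0], [0.5, 1.0]]   # illustrative; A A^T = [[1, 0.5], [0.5, 1.25]]
b = [2.0, -1.0]                # illustrative mean vector
samples = [sample_sirv(A, b, rng) for _ in range(50000)]
mean, cov = sample_moments(samples)
```

With enough samples, `mean` should approach b and `cov` should approach A A^T, although the marginal PDFs are not Gaussian.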

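The following theorem from [29] provides an interesting property of SSRVs when represented
in generalized spherical coordinates $R \in (0, \infty)$, $\Theta \in (0, 2\pi)$ and $\Phi_k \in (0, \pi)$, $(k = 1, \ldots, N-2)$.

Theorem 3 When the components of the random vector $\mathbf{X} = [X_1, \ldots, X_N]^T$ are represented in
the generalized spherical coordinates given by

$$\begin{aligned}
X_1 &= R\cos(\Phi_1) \\
X_k &= R\cos(\Phi_k)\prod_{i=1}^{k-1}\sin(\Phi_i) \quad (1 < k \le N-2) \\
X_{N-1} &= R\cos(\Theta)\prod_{i=1}^{N-2}\sin(\Phi_i) \\
X_N &= R\sin(\Theta)\prod_{i=1}^{N-2}\sin(\Phi_i),
\end{aligned} \tag{3.17}$$

X is an SSRV if and only if $R$, $\Theta$ and $\Phi_k$ are mutually statistically independent random
variables having PDFs of the form

$$\begin{aligned}
f_R(r) &= \frac{r^{N-1}}{2^{\frac{N}{2}-1}\,\Gamma\left(\frac{N}{2}\right)}\, h_N(r^2)\, u(r) \\
f_{\Phi_k}(\phi_k) &= \frac{\Gamma\left(\frac{N-k+1}{2}\right)}{\sqrt{\pi}\,\Gamma\left(\frac{N-k}{2}\right)}\, \sin^{N-1-k}(\phi_k)\,[u(\phi_k) - u(\phi_k - \pi)] \\
f_\Theta(\theta) &= (2\pi)^{-1}[u(\theta) - u(\theta - 2\pi)]
\end{aligned} \tag{3.18}$$

where $\Gamma(\cdot)$ is the Euler Gamma function and $u(\cdot)$ is the unit step function.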

Proof: Since the random vector X is an SSRV, its PDF is of the form of eq (3.2) with $h_N(p)$
being given by eq (3.11). The Jacobian of the transformation given by eq (3.17) is obtained in
[38] as

$$J = \left(R^{N-1}\prod_{k=1}^{N-2}\sin^{N-1-k}(\phi_k)\right)^{-1}. \tag{3.19}$$

Using eq (3.2) and eq (3.19) and noting that $R^2 = \sum_{k=1}^{N} X_k^2$, the joint PDF of $R$, $\Theta$ and $\Phi_k$
$(k = 1, 2, \ldots, N-2)$ becomes

$$f_{R,\Theta,\Phi_1,\ldots,\Phi_{N-2}}(r, \theta, \phi_1, \ldots, \phi_{N-2}) = \frac{r^{N-1}}{(2\pi)^{N/2}}\, h_N(r^2) \prod_{k=1}^{N-2}\sin^{N-1-k}(\phi_k). \tag{3.20}$$

Since the joint PDF in eq (3.20) can be written as a product of the marginal PDFs given
in eq (3.18), the variables $R$, $\Theta$ and $\Phi_k$ are mutually statistically independent with the
prescribed PDFs. In order to prove the sufficient part of the property, we start with the marginal
PDFs of $R$, $\Theta$ and $\Phi_k$ given by eq (3.18) and, under the assumption of statistical independence,
obtain the joint PDF of eq (3.20). Using the inverse of the Jacobian given by eq (3.19) then results
in the PDF of X being given by eq (3.2).

3.4  Determining  the  PDF  of  an  SIRV 

In  this  section  we  shall  present  schemes  for  determining  the  PDF  of  an  SIRV.  We  recognize  that 
the  PDF  of  an  SIRV  is  uniquely  determined  by  the  specification  of  a  mean  vector,  a  covariance 
matrix  and  a  characteristic  first  order  PDF  and  that  the  SIRV  PDF  has  the  form  of  eq  (3.15). 
Several techniques are available in the literature for specifying $h_N(p)$. The simplest technique
is to use eq (3.16). However, this procedure requires the knowledge of the characteristic PDF
$f_S(s)$. Therefore, when $f_S(s)$ is not known in closed form or it is difficult to evaluate the integral
in eq (3.16), alternate methods for specifying $h_N(p)$ must be examined.

To study the behavior of $h_N(p)$, it is convenient to replace p, which is a quadratic form
depending on N, by the dummy scalar variable w. We then write

$$h_N(w) = \int_0^\infty s^{-N} \exp\left(-\frac{w}{2s^2}\right) f_S(s)\,ds. \tag{3.21}$$
When both sides of eq (3.21) are differentiated with respect to w, we obtain

$$\frac{dh_N(w)}{dw} = -\frac{1}{2}\int_0^\infty s^{-N-2} \exp\left(-\frac{w}{2s^2}\right) f_S(s)\,ds. \tag{3.22}$$

The right hand side of eq (3.22) is related to $h_{N+2}(w)$ by the factor of $-\frac{1}{2}$. Thus, we have an
interesting result pointed out in [32] that

$$h_{N+2}(w) = (-2)\frac{dh_N(w)}{dw}. \tag{3.23}$$

Because

$$f_{\mathbf{Y}}(\mathbf{y}) = (2\pi)^{-\frac{N+2}{2}} |\Sigma|^{-1/2} h_{N+2}(p) \tag{3.24}$$

when Y is of dimension $N + 2$, it follows that $h_N(w)$ must be a monotonically decreasing function
for  all  N.  Eq  (3.23)  provides  a  mechanism  for  relating  higher  order  PDFs  with  those  of  lower 
order.  More  precisely,  starting  with  N  =  1  and  N  =  2,  and  using  eq  (3.23)  repeatedly,  gives  the 
following  pair  of  recurrence  relations. 


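$$\begin{aligned}
h_{2N+1}(w) &= (-2)^N \frac{d^N h_1(w)}{dw^N} \\
h_{2N+2}(w) &= (-2)^N \frac{d^N h_2(w)}{dw^N}
\end{aligned} \tag{3.25}$$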

Therefore, starting from $h_1(w)$ and $h_2(w)$, all PDFs of odd and even order, respectively, can be
generated by the use of eq (3.25). However, since $h_N(\cdot)$ is defined to be a non-negative monotonically
decreasing function for all N, $h_1(\cdot)$ and $h_2(\cdot)$ must belong to a class of functions that
are positive and monotonically decreasing. Consequently, their successive derivatives will alternate
between negative and positive functions that are monotonically increasing and decreasing,
respectively. Given $h_N(w)$, the Nth order SIRV PDF is given by

$$f_{\mathbf{Y}}(\mathbf{y}) = (2\pi)^{-N/2} |\Sigma|^{-1/2} h_N(p) \tag{3.26}$$

where $h_N(p)$ is nothing more than $h_N(w)$ with w replaced by p.
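
The recurrence of eq (3.23) can be checked numerically. The Python sketch below is only an illustration: it assumes a hypothetical discrete characteristic PDF (two scale values with the stated probabilities), for which the integral of eq (3.21) becomes a finite sum, and it approximates the derivative by a central difference.

```python
import math

# Hypothetical discrete characteristic PDF: (probability, value of s).
SCALES = [(0.3, 0.7), (0.7, 1.4)]

def h(n, w):
    """h_N(w) of eq (3.21) when f_S is a sum of impulses at the values in SCALES."""
    return sum(p * s ** (-n) * math.exp(-w / (2.0 * s * s)) for p, s in SCALES)

def dh_dw(n, w, eps=1e-5):
    """Central-difference approximation of dh_N(w)/dw."""
    return (h(n, w + eps) - h(n, w - eps)) / (2.0 * eps)

def recurrence_error(n, w):
    """Relative mismatch between h_{N+2}(w) and -2 dh_N(w)/dw, eq (3.23)."""
    lhs = h(n + 2, w)
    rhs = -2.0 * dh_dw(n, w)
    return abs(lhs - rhs) / abs(lhs)

errors = [recurrence_error(n, 1.3) for n in (1, 2, 3, 4)]
```

Every entry of `errors` should be close to zero, limited only by the finite-difference approximation.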

Another  approach  for  specifying  hN(p)  that  begins  with  the  univariate  characteristic  function 
has  been  proposed  in  [39,  28,  29].  It  is  required  that  the  univariate  characteristic  function  be  a 
real  even  function  whose  magnitude  is  integrable.  Also,  it  is  assumed  that  the  components  of 
the SIRV are identically distributed. Under these conditions, it has been shown that

$$h_N(p) = (\sqrt{p})^{1-\frac{N}{2}} \int_0^\infty \omega^{\frac{N}{2}} \phi(\omega) J_{\frac{N}{2}-1}(\omega\sqrt{p})\,d\omega \tag{3.27}$$

where $\phi(\omega)$ is the univariate characteristic function and $J_\alpha(\eta)$ is the Bessel function of order $\alpha$.
Eq (3.27) has an elegant proof by induction which is presented here. From eq (3.15) it follows
that $h_1(p_i)$ is related to the first order SIRV PDF of the ith component. More explicitly, we can
write

$$f_{Y_i}(y_i) = (\sqrt{2\pi}\,\sigma)^{-1} h_1(p_i) \quad (i = 1, 2, \ldots, N) \tag{3.28}$$

where $p_i = y_i^2/\sigma^2$ and $\sigma^2$ is the common variance of the random variables $Y_i$ $(i = 1, 2, \ldots, N)$. For
convenience, assume that $\sigma^2$ is unity. The univariate characteristic function is then given by

$$\phi_i(\omega) = \int_{-\infty}^{\infty} f_{Y_i}(y_i) \exp(j\omega y_i)\,dy_i. \tag{3.29}$$

Using the inverse Fourier transform and noting that $y_i = \sqrt{p_i}$, $h_1(p_i)$ can be expressed in terms
of the characteristic function as

$$h_1(p_i) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \phi_i(\omega) \exp(-j\omega\sqrt{p_i})\,d\omega. \tag{3.30}$$

Since $\phi_i(\omega)$ is the same for all i, the subscript i in eq (3.30) can be dropped. In addition, because
$\phi(\omega)$ is an even function, we can rewrite eq (3.30) as

$$h_1(p) = \sqrt{\frac{2}{\pi}} \int_0^\infty \phi(\omega) \cos(\omega\sqrt{p})\,d\omega. \tag{3.31}$$

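Recognizing that $\cos(x) = \sqrt{\frac{\pi x}{2}}\, J_{-\frac{1}{2}}(x)$, and replacing p by the dummy variable w, we have

$$h_1(w) = (\sqrt{w})^{\frac{1}{2}} \int_0^\infty \omega^{\frac{1}{2}} \phi(\omega) J_{-\frac{1}{2}}(\omega\sqrt{w})\,d\omega. \tag{3.32}$$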

Since the derivation makes use of eq (3.23), it is necessary to consider odd and even values of N
separately. For odd values of N, eq (3.27) can be written as

$$h_{2N-1}(w) = (\sqrt{w})^{\frac{3}{2}-N} \int_0^\infty \omega^{N-\frac{1}{2}} \phi(\omega) J_{N-\frac{3}{2}}(\omega\sqrt{w})\,d\omega. \tag{3.33}$$

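Equation (3.33) is now shown to hold for all N by means of induction. With N = 1, eq (3.33)
reduces to eq (3.32). It remains to show that eq (3.33) is valid when N is replaced by N + 1.
Differentiating both sides of eq (3.33) with respect to w, we obtain

$$\frac{dh_{2N-1}(w)}{dw} = \int_0^\infty \omega^{N-\frac{1}{2}} \phi(\omega) \frac{d}{dw}\left[(\sqrt{w})^{\frac{3}{2}-N} J_{N-\frac{3}{2}}(\omega\sqrt{w})\right] d\omega. \tag{3.34}$$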

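First, focus on the term $\frac{d}{dw}\left[(\sqrt{w})^{\frac{3}{2}-N} J_{N-\frac{3}{2}}(\omega\sqrt{w})\right]$. Since this involves the derivative of a product,
we can write

$$\frac{d}{dw}\left[(\sqrt{w})^{\frac{3}{2}-N} J_{N-\frac{3}{2}}(\omega\sqrt{w})\right] = \frac{1}{2w}\left(\frac{3}{2}-N\right)(\sqrt{w})^{\frac{3}{2}-N} J_{N-\frac{3}{2}}(\omega\sqrt{w}) + (\sqrt{w})^{\frac{3}{2}-N}\frac{d}{dw}\left[J_{N-\frac{3}{2}}(\omega\sqrt{w})\right]. \tag{3.35}$$

Using the identity [40]

$$\frac{dJ_\alpha(\eta)}{d\eta} = \frac{\alpha}{\eta} J_\alpha(\eta) - J_{\alpha+1}(\eta) \tag{3.36}$$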

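we have

$$\frac{d}{dw}\left[J_{N-\frac{3}{2}}(\omega\sqrt{w})\right] = \frac{\omega}{2\sqrt{w}}\left[\frac{N-\frac{3}{2}}{\omega\sqrt{w}}\, J_{N-\frac{3}{2}}(\omega\sqrt{w}) - J_{N-\frac{1}{2}}(\omega\sqrt{w})\right]. \tag{3.37}$$

Substituting eq (3.37) in eq (3.35) gives

$$\frac{d}{dw}\left[(\sqrt{w})^{\frac{3}{2}-N} J_{N-\frac{3}{2}}(\omega\sqrt{w})\right] = -\frac{\omega}{2}(\sqrt{w})^{\frac{1}{2}-N} J_{N-\frac{1}{2}}(\omega\sqrt{w}). \tag{3.38}$$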


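Consequently, eq (3.34) reduces to

$$\frac{dh_{2N-1}(w)}{dw} = -\frac{1}{2}(\sqrt{w})^{\frac{1}{2}-N} \int_0^\infty \omega^{N+\frac{1}{2}} \phi(\omega) J_{N-\frac{1}{2}}(\omega\sqrt{w})\,d\omega. \tag{3.39}$$

However, from eq (3.23) we know that $h_{2N+1}(w) = (-2)\frac{dh_{2N-1}(w)}{dw}$. Hence, we have from eq (3.39)

$$h_{2N+1}(w) = (\sqrt{w})^{\frac{1}{2}-N} \int_0^\infty \omega^{N+\frac{1}{2}} \phi(\omega) J_{N-\frac{1}{2}}(\omega\sqrt{w})\,d\omega. \tag{3.40}$$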

Because eq (3.40) is identical to eq (3.33) with N replaced by N + 1, it has been shown by
induction that eq (3.33) is valid for all N. It follows that eq (3.27) is valid for all odd values of
N.

In a similar manner, starting with $h_2(p)$, it can be shown that

$$h_{2N+2}(p) = (\sqrt{p})^{-N} \int_0^\infty \omega^{N+1} \phi(\omega) J_N(\omega\sqrt{p})\,d\omega \tag{3.41}$$

for all N. Note that eq (3.41) is identical to eq (3.27) with N replaced by 2N + 2. The proof of
this result is presented in Chapter 3. Thus, in general, for any N (odd or even), we can write
$h_N(p)$ as in eq (3.27).

3.5  Properties  of  SIRVs 

In  this  section  we  present  certain  important  properties  of  SIRVs. 

3.5.1  PDF  Characterization 

The multivariate PDF of an SIRV as given by eqs. (3.15) and (3.16) is uniquely determined
by the specification of a mean vector b, a covariance matrix $\Sigma$ and a characteristic first order
PDF $f_S(s)$. The PDF involves a non-negative, real valued, monotonically decreasing function
$h_N(\cdot)$ of a non-negative quadratic form. The type of SIRV is determined by the form of $h_N(\cdot)$
or, equivalently, the choice of $f_S(s)$. Higher order PDFs can be obtained by the use of eq (3.27)
whereas lower order PDFs can be obtained in the usual manner by integrating out the unwanted
variables. We discuss this procedure in Appendix A. The PDFs of all orders are of the same
type. The marginal PDFs are used to classify the type of SIRV.

3.5.2  Closure  Under  Linear  Transformation 

As shown in Theorem 2 of Section 2.3, every linear transformation of the form of eq (3.12) on
an SIRV results in another SIRV having the same characteristic PDF. This feature is called the
closure property of SIRVs [28, 29].

3.5.3  Minimum  Mean  Square  Error  Estimation 

In  minimum  mean  square  error  estimation  (MMSE)  problems,  given  a  set  of  data,  SIRVs  are 
found  to  result  in  linear  estimators  [39,  28,  41].  An  interesting  proof  of  this  property  is  presented 
here. Let $\mathbf{Y} = [\mathbf{Y}_1^T\ \mathbf{Y}_2^T]^T$ where $\mathbf{Y}_1 = [Y_1, Y_2, \ldots, Y_m]^T$ and $\mathbf{Y}_2 = [Y_{m+1}, Y_{m+2}, \ldots, Y_N]^T$ denote
the partitions of Y. It has been pointed out in [42] that the minimum mean square error estimate
of the random vector $\mathbf{Y}_2$ given the observations from the random vector $\mathbf{Y}_1$ is given by

$$\hat{\mathbf{Y}}_2 = E[\mathbf{Y}_2|\mathbf{Y}_1] \tag{3.42}$$

where $E[\mathbf{Y}_2|\mathbf{Y}_1]$ denotes the conditional mean or the expected value of $\mathbf{Y}_2$ given $\mathbf{Y}_1$. Assume
that Y is an SIRV of dimension N with characteristic PDF $f_S(s)$. Also, for convenience, it is
assumed that the mean of Y is zero. The covariance matrix of Y, denoted by $\Sigma$, can be partitioned
as

$$\Sigma = \begin{bmatrix} \mathbf{C}_{11} & \mathbf{C}_{12} \\ \mathbf{C}_{21} & \mathbf{C}_{22} \end{bmatrix} \tag{3.43}$$

where $\mathbf{C}_{11}$ denotes the covariance matrix of $\mathbf{Y}_1$, $\mathbf{C}_{12}$ denotes the cross covariance matrix of the
vectors $\mathbf{Y}_1$ and $\mathbf{Y}_2$, $\mathbf{C}_{21}$ is the transpose of $\mathbf{C}_{12}$, and $\mathbf{C}_{22}$ denotes the covariance matrix of the
vector $\mathbf{Y}_2$. The PDF of $\mathbf{Y}_2$ given $\mathbf{Y}_1$ is expressed as


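$$f_{\mathbf{Y}_2|\mathbf{Y}_1}(\mathbf{y}_2|\mathbf{y}_1) = \frac{f_{\mathbf{Y}}(\mathbf{y})}{f_{\mathbf{Y}_1}(\mathbf{y}_1)}. \tag{3.44}$$

Recall from eqs. (3.15) and (3.16) that

$$f_{\mathbf{Y}}(\mathbf{y}) = (2\pi)^{-N/2} |\Sigma|^{-1/2} h_N(p) \tag{3.45}$$

where

$$h_N(p) = \int_0^\infty s^{-N} \exp\left(-\frac{p}{2s^2}\right) f_S(s)\,ds \tag{3.46}$$

and, assuming b = 0, $p = \mathbf{y}^T \Sigma^{-1} \mathbf{y}$. Note that the inverse covariance matrix can be partitioned
as [38]

$$\Sigma^{-1} = \begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D} \end{bmatrix} \tag{3.47}$$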

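where

$$\begin{aligned}
\mathbf{A} &= (\mathbf{C}_{11} - \mathbf{C}_{12}\mathbf{C}_{22}^{-1}\mathbf{C}_{21})^{-1} \\
\mathbf{B} &= -\mathbf{A}\mathbf{C}_{12}\mathbf{C}_{22}^{-1} \\
\mathbf{C} &= -\mathbf{D}\mathbf{C}_{21}\mathbf{C}_{11}^{-1} \\
\mathbf{D} &= (\mathbf{C}_{22} - \mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{C}_{12})^{-1}.
\end{aligned} \tag{3.48}$$

Expanding the quadratic form, we have

$$p = \mathbf{y}_1^T\mathbf{A}\mathbf{y}_1 + \mathbf{y}_1^T\mathbf{B}\mathbf{y}_2 + \mathbf{y}_2^T\mathbf{C}\mathbf{y}_1 + \mathbf{y}_2^T\mathbf{D}\mathbf{y}_2. \tag{3.49}$$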
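Adding and subtracting $\mathbf{y}_1^T\mathbf{C}_{11}^{-1}\mathbf{y}_1$ to the right hand side of eq (3.49) gives

$$p = \mathbf{y}_1^T(\mathbf{A} - \mathbf{C}_{11}^{-1})\mathbf{y}_1 + \mathbf{y}_1^T\mathbf{C}_{11}^{-1}\mathbf{y}_1 + \mathbf{y}_1^T\mathbf{B}\mathbf{y}_2 + \mathbf{y}_2^T\mathbf{C}\mathbf{y}_1 + \mathbf{y}_2^T\mathbf{D}\mathbf{y}_2. \tag{3.50}$$

Note that

$$\mathbf{A} - \mathbf{C}_{11}^{-1} = -\mathbf{B}\mathbf{C}_{21}\mathbf{C}_{11}^{-1}. \tag{3.51}$$

Hence,

$$p = \mathbf{y}_1^T\mathbf{C}_{11}^{-1}\mathbf{y}_1 - \mathbf{y}_1^T\mathbf{B}\mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{y}_1 + \mathbf{y}_1^T\mathbf{B}\mathbf{y}_2 + \mathbf{y}_2^T\mathbf{C}\mathbf{y}_1 + \mathbf{y}_2^T\mathbf{D}\mathbf{y}_2. \tag{3.52}$$

However, it can be shown that

$$\begin{aligned}
\mathbf{y}_2^T\mathbf{C}\mathbf{y}_1 &= -\mathbf{y}_2^T\mathbf{D}\mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{y}_1 \\
\mathbf{y}_1^T\mathbf{B}\mathbf{y}_2 &= -\mathbf{y}_1^T\mathbf{C}_{11}^{-1}\mathbf{C}_{12}\mathbf{D}\mathbf{y}_2 \\
-\mathbf{y}_1^T\mathbf{B}\mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{y}_1 &= \mathbf{y}_1^T\mathbf{C}_{11}^{-1}\mathbf{C}_{12}\mathbf{D}\mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{y}_1.
\end{aligned} \tag{3.53}$$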
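Making these substitutions in the expression for p, it follows that

$$p = \mathbf{y}_1^T\mathbf{C}_{11}^{-1}\mathbf{y}_1 + \mathbf{y}_2^T\mathbf{D}\mathbf{y}_2 - \mathbf{y}_2^T\mathbf{D}\mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{y}_1 - \mathbf{y}_1^T\mathbf{C}_{11}^{-1}\mathbf{C}_{12}\mathbf{D}\mathbf{y}_2 + \mathbf{y}_1^T\mathbf{C}_{11}^{-1}\mathbf{C}_{12}\mathbf{D}\mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{y}_1. \tag{3.54}$$

This can be rewritten as

$$p = \mathbf{y}_1^T\mathbf{C}_{11}^{-1}\mathbf{y}_1 + (\mathbf{y}_2 - \mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{y}_1)^T\mathbf{D}(\mathbf{y}_2 - \mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{y}_1). \tag{3.55}$$

For simplicity, we define

$$\begin{aligned}
p_1 &= \mathbf{y}_1^T\mathbf{C}_{11}^{-1}\mathbf{y}_1 \\
p_2 &= (\mathbf{y}_2 - \mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{y}_1)^T\mathbf{D}(\mathbf{y}_2 - \mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{y}_1).
\end{aligned} \tag{3.56}$$

Then

$$p = p_1 + p_2. \tag{3.57}$$
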


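From eqs (3.57) and (3.44)-(3.46), we have

$$f_{\mathbf{Y}_2|\mathbf{Y}_1}(\mathbf{y}_2|\mathbf{y}_1) = \frac{k}{f_{\mathbf{Y}_1}(\mathbf{y}_1)} \int_0^\infty s^{-N} \exp\left(-\frac{p_1 + p_2}{2s^2}\right) f_S(s)\,ds \tag{3.58}$$

where $k = (2\pi)^{-N/2}|\Sigma|^{-1/2}$. Next, consider

$$E(\mathbf{Y}_2|\mathbf{Y}_1) = \frac{k}{f_{\mathbf{Y}_1}(\mathbf{y}_1)} \int_0^\infty s^{-N} \exp\left(-\frac{p_1}{2s^2}\right) f_S(s) \left[\int \mathbf{y}_2 \exp\left(-\frac{p_2}{2s^2}\right) d\mathbf{y}_2\right] ds. \tag{3.59}$$

Noting that

$$\int \mathbf{y}_2 \exp\left(-\frac{p_2}{2s^2}\right) d\mathbf{y}_2 = (2\pi)^{\frac{N-m}{2}} |\mathbf{D}|^{-1/2} s^{N-m} (\mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{y}_1), \tag{3.60}$$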

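gives

$$E(\mathbf{Y}_2|\mathbf{Y}_1) = \frac{\mathbf{k}_1}{f_{\mathbf{Y}_1}(\mathbf{y}_1)} \int_0^\infty s^{-m} \exp\left(-\frac{p_1}{2s^2}\right) f_S(s)\,ds \tag{3.61}$$

where $\mathbf{k}_1 = (2\pi)^{-m/2}|\Sigma|^{-1/2}|\mathbf{D}|^{-1/2}[\mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{y}_1]$. When a matrix is partitioned as in eq (3.47), it is
known that [43]

$$|\Sigma| = |\mathbf{C}_{11}|\,|\mathbf{C}_{22} - \mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{C}_{12}|. \tag{3.62}$$

Since

$$\mathbf{D} = (\mathbf{C}_{22} - \mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{C}_{12})^{-1}, \tag{3.63}$$

it follows that

$$|\Sigma| = |\mathbf{C}_{11}|\,|\mathbf{D}^{-1}|. \tag{3.64}$$

Thus,

$$|\Sigma^{-1}| = |\mathbf{C}_{11}|^{-1}|\mathbf{D}|. \tag{3.65}$$

Hence, $\mathbf{k}_1 = (2\pi)^{-m/2}|\mathbf{C}_{11}|^{-1/2}[\mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{y}_1]$. Finally, since

$$f_{\mathbf{Y}_1}(\mathbf{y}_1) = (2\pi)^{-m/2}|\mathbf{C}_{11}|^{-1/2} \int_0^\infty s^{-m} \exp\left(-\frac{p_1}{2s^2}\right) f_S(s)\,ds, \tag{3.66}$$

we obtain

$$\hat{\mathbf{Y}}_2 = E(\mathbf{Y}_2|\mathbf{Y}_1) = \mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{y}_1. \tag{3.67}$$

It is seen that the MMSE estimate of $\mathbf{Y}_2$ given the data $\mathbf{Y}_1$ is a linear function of $\mathbf{Y}_1$.

If the random vectors $\mathbf{Y}_1$ and $\mathbf{Y}_2$ have non-zero means denoted by $\mathbf{b}_1$ and $\mathbf{b}_2$ respectively,
then eq (3.67) takes the form

$$E(\mathbf{Y}_2|\mathbf{Y}_1) = \mathbf{b}_2 + \mathbf{C}_{21}\mathbf{C}_{11}^{-1}(\mathbf{y}_1 - \mathbf{b}_1). \tag{3.68}$$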
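
The linearity of the conditional mean can be illustrated by simulation. The Python sketch below is a Monte Carlo check under stated assumptions, not part of the derivation: it uses a hypothetical two-point characteristic PDF with E[S^2] = 1, a zero mean bivariate SIRV with unit variances and an assumed correlation RHO, so that C21 C11^{-1} reduces to RHO. The average of Y2 over narrow bins of Y1 is compared with RHO times the average of Y1 in the same bin.

```python
import math
import random

RHO = 0.6  # assumed correlation between Y1 and Y2 (both have unit variance)

def draw_pair(rng):
    """Draw (Y1, Y2) = S * (Z1, RHO*Z1 + sqrt(1 - RHO^2)*Z2)."""
    # Hypothetical characteristic PDF: S^2 is 0.5 or 1.5 with equal probability.
    s = math.sqrt(0.5) if rng.random() < 0.5 else math.sqrt(1.5)
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    return s * z1, s * (RHO * z1 + math.sqrt(1.0 - RHO ** 2) * z2)

def bin_means(pairs, lo, hi):
    """Average of Y1 and of Y2 over the samples with lo <= Y1 < hi."""
    selected = [(y1, y2) for y1, y2 in pairs if lo <= y1 < hi]
    mean_y1 = sum(p[0] for p in selected) / len(selected)
    mean_y2 = sum(p[1] for p in selected) / len(selected)
    return mean_y1, mean_y2

rng = random.Random(1)
pairs = [draw_pair(rng) for _ in range(200000)]
bins = [(-2.2, -1.8), (0.8, 1.2), (1.8, 2.2)]
results = [bin_means(pairs, lo, hi) for lo, hi in bins]
# Each (mean_y1, mean_y2) should satisfy mean_y2 ~ RHO * mean_y1, per eq (3.67).
```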

As a consequence of this property, when the random vectors $\mathbf{Y}_1$ and $\mathbf{Y}_2$ are uncorrelated so
that $\mathbf{C}_{21} = \mathbf{0}$, then we have

$$E[\mathbf{Y}_2|\mathbf{Y}_1] = \mathbf{b}_2 = E[\mathbf{Y}_2]. \tag{3.69}$$

This property is referred to as semi independence in [39, 44, 28]. However, for all SIRVs except
the Gaussian, this result does not imply that

$$f_{\mathbf{Y}_2|\mathbf{Y}_1}(\mathbf{y}_2|\mathbf{y}_1) = f_{\mathbf{Y}_2}(\mathbf{y}_2). \tag{3.70}$$

This emphasizes the property that although uncorrelatedness guarantees statistical independence
for Gaussian random vectors, it is not a general property of SIRVs.
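
A short simulation makes the distinction between uncorrelatedness and independence concrete. This Python sketch is illustrative and rests on an assumed two-point characteristic PDF with E[S^2] = 1. The two components X1 = S Z1 and X2 = S Z2 are uncorrelated, yet their squares have covariance Var(S^2), which equals 0.25 for this choice of S, so the components cannot be independent.

```python
import math
import random

rng = random.Random(2)
n = 100000
sum_xy = sum_x2 = sum_y2 = sum_x2y2 = 0.0
for _ in range(n):
    # Hypothetical characteristic PDF: S^2 is 0.5 or 1.5 with equal probability.
    s = math.sqrt(0.5) if rng.random() < 0.5 else math.sqrt(1.5)
    x = s * rng.gauss(0.0, 1.0)   # first component of the SSRV
    y = s * rng.gauss(0.0, 1.0)   # second component, shares the same S
    sum_xy += x * y
    sum_x2 += x * x
    sum_y2 += y * y
    sum_x2y2 += x * x * y * y

corr_xy = sum_xy / n                                     # estimate of E[X1 X2]
cov_sq = sum_x2y2 / n - (sum_x2 / n) * (sum_y2 / n)      # estimate of Cov(X1^2, X2^2)
```

Here `corr_xy` should be near zero while `cov_sq` should be clearly positive, which is the signature of dependence without correlation.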

3.5.4  Distribution  of  Sums  of  SIRVs 

While the sum of two jointly Gaussian random vectors is also Gaussian, the sum of two
SIRVs is not, in general, an SIRV. The sum is an SIRV when the two SIRVs are statistically
independent, have zero mean, and the covariance matrix of the first is within
a multiplicative constant of the covariance matrix of the second [28, 29]. More precisely, let
$\mathbf{Y}_1 = [Y_{11}, Y_{12}, \ldots, Y_{1N}]^T$ and $\mathbf{Y}_2 = [Y_{21}, Y_{22}, \ldots, Y_{2N}]^T$ denote two independent zero mean SIRVs.
The covariance matrix and characteristic PDF of $\mathbf{Y}_1$ are denoted by $\Sigma_1$ and $f_{S_1}(s_1)$. The corresponding
quantities for $\mathbf{Y}_2$ are denoted by $\Sigma_2$ and $f_{S_2}(s_2)$. We are interested in obtaining the
distribution of the sum given by

$$\mathbf{Y} = \mathbf{Y}_1 + \mathbf{Y}_2. \tag{3.71}$$

The characteristic function of Y is given by

$$E[\exp(j\boldsymbol{\omega}^T\mathbf{Y})] = g_1(\boldsymbol{\omega}^T\Sigma_1\boldsymbol{\omega})\, g_2(\boldsymbol{\omega}^T\Sigma_2\boldsymbol{\omega}) \tag{3.72}$$

where $g_1(\cdot)$ and $g_2(\cdot)$ are the characteristic functions of $\mathbf{Y}_1$ and $\mathbf{Y}_2$, respectively. If Y is a zero
mean SIRV, then its characteristic function has the form

$$E[\exp(j\boldsymbol{\omega}^T\mathbf{Y})] = g(\boldsymbol{\omega}^T\Sigma\boldsymbol{\omega}). \tag{3.73}$$

In order to write eq (3.72) as a function of a single quadratic form, $\Sigma_2$ must be within a multiplicative
constant of $\Sigma_1$.
3.5.5  Markov  Property  for  SIRPs 

An interesting property of SIRPs is that a zero mean wide sense stationary SIRP is Markov if
and only if its autocorrelation function has the form

$$R(t_1, t_2) = \exp(-a|t_1 - t_2|). \tag{3.74}$$

This result is well known for the special case of a zero mean wide sense stationary Gaussian
random process. To demonstrate the more general result we consider N samples from a zero
mean wide sense stationary SIRP y(t). Let $\mathbf{Y} = [Y_1, Y_2, \ldots, Y_N]^T$ denote the vector of successive
samples obtained from the SIRP.

Given that y(t) is a zero mean wide sense stationary Markov SIRP, we first show that its
autocorrelation function must have the form of eq (3.74). Let $Y_1$, $Y_2$ and $Y_3$ denote the random
variables obtained by sampling y(t) at time instants $t_1$, $t_2$ and $t_3$ such that $t_1 < t_2 < t_3$. Since
y(t) is a Markov process, the joint PDF of $Y_1$, $Y_2$ and $Y_3$ can be expressed as

$$f_{Y_1,Y_2,Y_3}(y_1, y_2, y_3) = f_{Y_1}(y_1)\, f_{Y_2|Y_1}(y_2|y_1)\, f_{Y_3|Y_2}(y_3|y_2). \tag{3.75}$$

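The autocorrelation function $R(t_3, t_1) = E[Y_3 Y_1]$ is given by

$$R(t_3, t_1) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y_3 y_1 f_{Y_1,Y_2,Y_3}(y_1, y_2, y_3)\,dy_1\,dy_2\,dy_3. \tag{3.76}$$

Also,

$$R(t_2, t_2) = E[Y_2^2] = \int_{-\infty}^{\infty} y_2^2 f_{Y_2}(y_2)\,dy_2. \tag{3.77}$$

Hence,

$$R(t_3, t_1)R(t_2, t_2) = \left[\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y_3 y_1 f_{Y_1,Y_2,Y_3}(y_1, y_2, y_3)\,dy_1\,dy_2\,dy_3\right]\left[\int_{-\infty}^{\infty} y_2^2 f_{Y_2}(y_2)\,dy_2\right]. \tag{3.78}$$

Using eq (3.75), together with the linear conditional mean of eq (3.68), we can rewrite the above equation as

$$R(t_3, t_1)R(t_2, t_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y_3 y_2 f_{Y_3,Y_2}(y_3, y_2)\,dy_3\,dy_2 \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y_2 y_1 f_{Y_2,Y_1}(y_2, y_1)\,dy_2\,dy_1. \tag{3.79}$$

Consequently,

$$R(t_3, t_1)R(t_2, t_2) = R(t_3, t_2)R(t_2, t_1). \tag{3.80}$$

The only non-trivial autocorrelation function satisfying this property is given by eq (3.74).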

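Since y(t) is a zero mean SIRP, it follows that E[Y] = 0. Letting b = exp(-a), we can write
the covariance matrix of Y as

$$\Sigma = \begin{bmatrix}
1 & b & \cdots & b^{N-1} \\
b & 1 & \cdots & b^{N-2} \\
\vdots & \vdots & \ddots & \vdots \\
b^{N-1} & b^{N-2} & \cdots & 1
\end{bmatrix}. \tag{3.81}$$

We then make use of eq (3.68) to obtain

$$E[Y_N|Y_{N-1}, Y_{N-2}, \ldots, Y_1] = [b^{N-1}\ b^{N-2}\ \ldots\ b]\,\Sigma'^{-1}\mathbf{Y}' \tag{3.82}$$

where $\mathbf{Y}' = [Y_1, Y_2, \ldots, Y_{N-1}]^T$ and

$$\Sigma' = \begin{bmatrix}
1 & b & \cdots & b^{N-2} \\
b & 1 & \cdots & b^{N-3} \\
\vdots & \vdots & \ddots & \vdots \\
b^{N-2} & b^{N-3} & \cdots & 1
\end{bmatrix}. \tag{3.83}$$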
Recognizing that

$$\Sigma'^{-1} = \frac{1}{1 - b^2}\begin{bmatrix}
1 & -b & 0 & \cdots & 0 \\
-b & 1 + b^2 & -b & \cdots & 0 \\
0 & -b & 1 + b^2 & \ddots & \vdots \\
\vdots & & \ddots & 1 + b^2 & -b \\
0 & \cdots & 0 & -b & 1
\end{bmatrix}, \tag{3.84}$$

we can rewrite eq (3.82) as

$$E[Y_N|Y_{N-1}, Y_{N-2}, \ldots, Y_1] = bY_{N-1}. \tag{3.85}$$

From eq (3.68), we also obtain

$$E[Y_N|Y_{N-1}] = bY_{N-1}. \tag{3.86}$$

Clearly, $E[Y_N|Y_{N-1}] = E[Y_N|Y_{N-1}, Y_{N-2}, \ldots, Y_1]$. Since this must be true for all choices of
$Y_1, Y_2, \ldots, Y_{N-1}$, it follows that $f_{Y_N|Y_{N-1},Y_{N-2},\ldots,Y_1}(y_N|y_{N-1}, y_{N-2}, \ldots, y_1) = f_{Y_N|Y_{N-1}}(y_N|y_{N-1})$.
Hence, y(t) is Markov.
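
The step from eq (3.82) to eq (3.85) can be verified without inverting the covariance matrix. The Python sketch below is an illustrative check with assumed values of N and b and hypothetical function names: because the covariance matrix of the first N-1 samples is nonsingular, the predictor weights equal [0, ..., 0, b] exactly when that weight vector multiplied by the covariance matrix reproduces the cross-covariance vector [b^{N-1}, ..., b].

```python
def exp_cov(n, b):
    """Covariance matrix with entries b^|i-j|, as in eq (3.81)."""
    return [[b ** abs(i - j) for j in range(n)] for i in range(n)]

def predictor_mismatch(n, b):
    """Largest entry of |w * Sigma' - c| for the candidate weights w = [0,...,0,b]."""
    sigma_prime = exp_cov(n - 1, b)                    # covariance of Y_1..Y_{N-1}
    cross = [b ** (n - 1 - j) for j in range(n - 1)]   # Cov(Y_N, Y_{j+1})
    weights = [0.0] * (n - 2) + [b]                    # only Y_{N-1} is used
    product = [sum(weights[i] * sigma_prime[i][j] for i in range(n - 1))
               for j in range(n - 1)]
    return max(abs(p - c) for p, c in zip(product, cross))

mismatch = predictor_mismatch(6, 0.7)   # N = 6 and b = 0.7 are arbitrary choices
```

A mismatch at the level of rounding error confirms that only the most recent sample contributes to the conditional mean.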

3.5.6  Kalman  Filter  for  SIRPs 

It has been shown by Chu in [41] that the Kalman filter for SIRPs is identical to the corresponding
filter for a Gaussian random process. The model considered in [41] is given by

$$\begin{aligned}
\mathbf{x}_{k+1} &= \mathbf{F}_k\mathbf{x}_k + \mathbf{G}_k\mathbf{w}_k \quad (k = 0, 1, \ldots, N-1) \\
\mathbf{y}_k &= \mathbf{H}_k\mathbf{x}_k + \mathbf{v}_k \quad (k = 0, 1, \ldots, N-1)
\end{aligned} \tag{3.87}$$


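where $\mathbf{x}_k$ denotes the state vector of the underlying process, $\mathbf{w}_k$ is its excitation vector, $\mathbf{y}_k$
denotes the observation vector and $\mathbf{v}_k$ is the measurement noise. It is assumed that $\mathbf{x}_k$, $\mathbf{w}_k$ and
$\mathbf{v}_k$ are jointly SIRP with a common characteristic PDF $f_S(s)$. Also, let

$$\begin{aligned}
E[\mathbf{x}_k] &= \bar{\mathbf{x}}_k \quad (k = 0, 1, \ldots, N-1) \\
E[(\mathbf{x}_k - \bar{\mathbf{x}}_k)(\mathbf{x}_k - \bar{\mathbf{x}}_k)^T] &= \mathbf{M}_k \\
E[\mathbf{w}_k] = E[\mathbf{v}_k] &= \mathbf{0} \\
E[(\mathbf{x}_k - \bar{\mathbf{x}}_k)\mathbf{w}_k^T] = E[(\mathbf{x}_k - \bar{\mathbf{x}}_k)\mathbf{v}_k^T] = E[\mathbf{w}_k\mathbf{v}_k^T] &= \mathbf{0} \\
E[w_{k,l}\,w_{k,m}] &= Q_k\,\delta_{l,m} \\
E[v_{k,l}\,v_{k,m}] &= R_k\,\delta_{l,m}
\end{aligned} \tag{3.88}$$

where $w_{k,m}$ and $v_{k,m}$ are the mth components of $\mathbf{w}_k$ and $\mathbf{v}_k$ respectively, and $\delta_{l,m}$ is the Kronecker
delta function. Hence, $\mathbf{x}_k$, $\mathbf{w}_k$ and $\mathbf{v}_k$ are mutually uncorrelated while $\mathbf{w}_k$ and $\mathbf{v}_k$ are each white
with zero mean.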

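The innovations vector is defined as

$$\tilde{\mathbf{y}}_{k|k-1} = \mathbf{y}_k - \mathbf{H}_k\hat{\mathbf{x}}_{k|k-1} \tag{3.89}$$

where $\hat{\mathbf{x}}_{k|k-1}$ is the MMSE estimate of $\mathbf{x}_k$ given the observation vectors up to $k - 1$. The
covariance matrix of the innovations can be shown to be

$$\mathrm{Cov}(\tilde{\mathbf{y}}_{k|k-1}) = \mathbf{S}_{k|k-1} = \mathbf{H}_k\mathbf{M}_k\mathbf{H}_k^T + \mathbf{R}_k. \tag{3.90}$$

It can be readily shown that $\mathbf{x}_k$ and $\mathbf{y}_k$ are jointly SIRP. Therefore, the MMSE estimate of $\mathbf{x}_k$
given the observation vectors up to $k - 1$ is a linear function of $\mathbf{y}_m$, $m = 1, 2, \ldots, k - 1$, as
shown by eq (3.68). Hence, the Kalman filter equations for SIRPs are identical to those for the
Gaussian case. The Kalman gain denoted by $\mathbf{K}_{k|k}$ is expressed as

$$\mathbf{K}_{k|k} = \mathbf{M}_k\mathbf{H}_k^T\mathbf{S}_{k|k-1}^{-1}. \tag{3.91}$$

The measurement update $\hat{\mathbf{x}}_{k|k}$ is given by

$$\hat{\mathbf{x}}_{k|k} = \hat{\mathbf{x}}_{k|k-1} + \mathbf{K}_{k|k}\tilde{\mathbf{y}}_{k|k-1} = (\mathbf{I} - \mathbf{K}_{k|k}\mathbf{H}_k)\hat{\mathbf{x}}_{k|k-1} + \mathbf{K}_{k|k}\mathbf{y}_k. \tag{3.92}$$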
The covariance matrix of the error in the update can be written as

$$\mathbf{C}_k = \mathbf{M}_k - \mathbf{M}_k\mathbf{H}_k^T(\mathbf{H}_k\mathbf{M}_k\mathbf{H}_k^T + \mathbf{R}_k)^{-1}\mathbf{H}_k\mathbf{M}_k. \tag{3.93}$$

The prediction is then given by

$$\hat{\mathbf{x}}_{k+1|k} = \mathbf{F}_k\hat{\mathbf{x}}_{k|k}. \tag{3.94}$$

Finally, the covariance matrix of the prediction is expressed as

$$\mathbf{M}_{k+1} = \mathbf{F}_k\mathbf{C}_k\mathbf{F}_k^T + \mathbf{G}_k\mathbf{Q}_k\mathbf{G}_k^T. \tag{3.95}$$

When systems driven by non-Gaussian noise are encountered in practice, under the assumption
of joint SIRPs, these equations provide an efficient computation formula for the Kalman filter.
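
For a scalar state and a scalar observation, the recursion of eqs (3.89) to (3.95) reduces to a few lines. The Python sketch below is a minimal illustration: the function name, the argument names and the numerical values are assumptions made for the example, and all quantities are scalars so that transposes and matrix inverses become ordinary products and divisions.

```python
def kalman_step(x_pred, M, y, F, G, H, Q, R):
    """One scalar measurement update followed by one prediction.

    x_pred : predicted state given observations up to k-1
    M      : variance of the prediction error
    y      : observation at time k
    """
    innovation = y - H * x_pred          # eq (3.89)
    S = H * M * H + R                    # innovation variance, eq (3.90)
    K = M * H / S                        # Kalman gain, eq (3.91)
    x_upd = x_pred + K * innovation      # measurement update, eq (3.92)
    C = M - M * H * H * M / S            # updated error variance, eq (3.93)
    x_next = F * x_upd                   # prediction, eq (3.94)
    M_next = F * C * F + G * Q * G       # prediction error variance, eq (3.95)
    return x_upd, C, x_next, M_next

# Illustrative numbers only.
x_upd, C, x_next, M_next = kalman_step(
    x_pred=0.0, M=1.0, y=2.0, F=1.0, G=1.0, H=1.0, Q=0.25, R=1.0)
```

With these inputs the gain is 0.5, so the update moves halfway from the prediction to the observation. The same code applies whether the joint SIRP is Gaussian or not, since only first and second moments enter the recursion.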

3.5.7  Statistical  Independence 

We point out that the only case for which the components of an SSRV are statistically independent
occurs when the SSRV is Gaussian. This property is proved in Appendix A.

3.5.8  Ergodicity  of  SIRPs 

It  has  been  pointed  out  in  [39]  that  an  ergodic  SIRP  is  necessarily  Gaussian.  The  proof  of 
the  non-ergodicity  of  SIRPs  (except  Gaussian)  can  be  easily  obtained  using  the  representation 
theorem  [28]  for  SIRPs  which  states  that  an  SIRP  is  a  univariate  randomization  of  the  Gaussian 
random process. More precisely, if y(t) is an SIRP, then it can be expressed as y(t) = Sz(t),
where S is a non-negative random variable and z(t) is a Gaussian random process. Clearly, if
z(t) is stationary, then y(t) will also be stationary. However, different realizations of S result
in  different  scale  factors  for  the  sample  functions  of  y(t).  Therefore,  time  averages  will  differ 
from  one  sample  function  to  another  and,  in  general,  will  not  equal  the  corresponding  ensemble 
average.  Consequently,  y(t)  cannot  be  ergodic.  When  S  is  a  non-random  constant,  y(t)  is  a 
Gaussian  random  process.  Then  y(t)  will  be  ergodic  provided  z(t)  is  also  ergodic.  It  is  concluded 
that  only  Gaussian  SIRPs  can  be  ergodic. 
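
The non-ergodic behavior is easy to reproduce numerically. The Python sketch below is illustrative and relies on assumptions: z(t) is taken to be white Gaussian noise with unit variance, and S takes one of two hypothetical values with E[S^2] = 1, so the ensemble average power of y(t) is 1. Each sample function is generated for one fixed realization of S, and its time average power is computed.

```python
import math
import random

def time_average_power(s, n_samples, rng):
    """Time average of y^2 over one sample function y = s * z."""
    return sum((s * rng.gauss(0.0, 1.0)) ** 2 for _ in range(n_samples)) / n_samples

rng = random.Random(3)
ensemble_power = 1.0                                   # E[S^2] * E[z^2] for this example
realizations_of_s = [math.sqrt(0.5), math.sqrt(1.5)]   # the two possible values of S
time_averages = [time_average_power(s, 20000, rng) for s in realizations_of_s]
```

Each time average settles near the square of its own scale factor (0.5 or 1.5) rather than near the ensemble value of 1, which is the behavior described above.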

3.6  Conclusion 

In  this  chapter,  we  have  presented  an  overview  of  the  literature  on  both  the  modeling  of  radar 
clutter  and  the  theory  of  SIRPs.  It  is  clear  from  this  chapter  that  the  PDF  of  an  SIRV  is  uniquely 
determined  by  the  specification  of  a  mean  vector,  a  covariance  matrix  and  a  characteristic  first 
order  PDF.  It  is  also  seen  that  many  interesting  properties  of  Gaussian  random  processes  extend 
readily to SIRPs. A major difference with non-Gaussian SIRPs is their non-ergodic behavior.
Consequently, time averages do not result in corresponding ensemble averages. However, if
ensemble averages are used instead of time averages, then non-ergodicity is not a serious problem.
In  the  following  chapters,  we  shall  present  the  application  of  SIRPs  for  non-Gaussian  radar  clutter 
modeling,  simulation  and  distribution  identification. 




Chapter  4 

Radar  Clutter  Modeling  Using 
Spherically  Invariant  Random 
Processes 


4.1  Introduction 

In  this  chapter  we  consider  the  use  of  the  theory  of  spherically  invariant  random  processes 
(SIRP)  for  modeling  correlated  non-Gaussian  radar  clutter.  It  has  been  pointed  out  in  chapter  2 
that  radar  clutter  can  be  non-Gaussian  and  that  radars  process  N  pulses  at  a  time.  Furthermore, 
the  clutter  can  be  highly  correlated.  Therefore,  by  clutter  modeling  we  mean  the  specification 
of  the  joint  probability  density  function  (PDF)  of  the  N  correlated  clutter  samples.  Since  we 
are  dealing  with  correlated  clutter,  the  joint  PDF  cannot  be  constructed  by  simply  taking  the 
product  of  the  marginal  PDFs.  This  chapter  presents  a  mathematically  elegant  and  tractable 
approach for specifying the joint PDF of N clutter samples. In addition, we discuss the characterization
of Gaussian and non-Gaussian correlated random vectors, the need for a library of
multivariate PDFs for modeling correlated non-Gaussian clutter, several techniques for establishing
this library and, finally, a key result for the distribution identification of multivariate
correlated  non-Gaussian  random  vectors. 

Specifically,  the  problem  of  modeling  a  random  vector  obtained  by  sampling  a  stochastic 
process y(t) at N time instants is of interest to us. The stochastic process may be real or
complex. In addition, there is no restriction on the number of samples obtained or the sampling
time instants. In order to completely characterize the random vector we need to specify the joint
probability density function of the N samples (real or complex) or, equivalently, specify the joint
characteristic function. This problem is very well treated when the underlying stochastic process
is Gaussian. The joint PDF in this case can be written as $(2\pi)^{-N/2}|\Sigma|^{-1/2}\exp(-p/2)$, where p is a
non-negative quadratic form given by $p = (\mathbf{y} - \boldsymbol{\mu})^T\Sigma^{-1}(\mathbf{y} - \boldsymbol{\mu})$. Here $\boldsymbol{\mu}$ and $\Sigma$ denote the mean
vector and covariance matrix of the Gaussian random vector Y whose components are the N
samples of y(t). However, if y(t) is not a Gaussian random process, there is no unique specification
for the joint PDF of the N samples except when the samples are statistically independent.

When  processing  real  world  data,  neither  the  Gaussianity  of  the  underlying  stochastic  process 
nor  the  statistical  independence  of  the  samples  is  guaranteed.  In  fact,  it  is  likely  that  the  samples 
may  be  correlated.  Hence,  we  need  to  obtain  multivariate  non-Gaussian  PDFs  which  can  model 
the  correlation  between  samples.  In  practice,  radar  clutter  can  vary  from  one  application  to 
another.  Therefore,  we  need  to  have  available  a  library  of  possible  multivariate  non-Gaussian 
PDFs  so  that  an  appropriate  PDF  can  be  chosen  to  approximate  the  data  for  each  clutter 
scenario. 

The theory of Spherically Invariant Random Processes (SIRP) provides us with elegant
and mathematically tractable techniques to construct multivariate non-Gaussian PDFs. Spherically
invariant random processes are generalizations of the familiar Gaussian random process.
The PDF of every random vector obtained by sampling an SIRP is uniquely determined by the
specification of a mean vector, a covariance matrix and a characteristic first order PDF. In addition,
the PDF of a random vector obtained by sampling an SIRP is a function of a non-negative
quadratic form. However, the PDF does not necessarily involve an exponential dependence on
the quadratic form, as in the Gaussian case. Such a random vector is called a Spherically
Invariant Random Vector (SIRV).

There  are  two  kinds  of  models  for  non-Gaussian  radar  clutter.  One  is  called  the  endogenous 
model,  where  the  desired  non-Gaussian  process  with  prescribed  envelope  PDF  and  correlation 
function is realized by using a zero memory nonlinear transformation on a Gaussian process
having  a  prespecified  correlation  function.  In  this  approach  it  is  not  possible  to  independently 
control  the  envelope  PDF  and  the  correlation  properties  of  the  non-Gaussian  process.  In  addition, 
not  all  nonlinearities  give  rise  to  a  non-negative  definite  covariance  matrix  at  their  outputs.  The 
second  model  is  called  an  exogenous  product  model  [26].  In  this  model,  the  desired  non-Gaussian 
clutter is generated by the product of a Gaussian random process and an independent non-Gaussian
process which can be highly correlated. In this scheme, the desired envelope PDF and
the correlation properties can be controlled independently. The exogenous model can be thought
of  as  a  slowly  time  variant  non-Gaussian  process  modulating  a  Gaussian  random  process.  The 
SIRP  is  a  special  case  of  the  exogenous  model,  arising  when  the  modulating  process  does  not 
change  rapidly  during  the  observation  interval  and  can  be  approximated  as  a  random  variable. 
This  is  due  to  the  fact  that  the  representation  theorem  for  SIRPs  allows  us  to  explicitly  write  the 
non-Gaussian  process  as  a  product  of  a  Gaussian  process  and  a  non-negative  random  variable. 
By  assuming  statistical  independence  between  the  modulating  random  variable  and  the  Gaussian 
process,  it  is  possible  to  independently  control  the  non-Gaussian  envelope  PDF  and  its  correlation 
properties. The SIRP is the only known case of the exogenous multiplicative model which allows
the  specification  of  the  Nth  order  PDF. 

Section  4.2  outlines  the  problem  of  interest.  In  Section  4.3  we  present  several  techniques  to 
obtain  SIRVs.  Examples  based  on  various  techniques  described  in  Section  4.3  are  used  to  obtain 
a  library  of  SIRV  PDFs  in  Section  4.4.  Finally,  in  Section  4.5,  we  present  a  key  result  which 
characterizes  SIRVs  by  using  the  quadratic  form  appearing  in  their  PDFs. 

4.2  Problem  Statement 

We  assume  we  are  dealing  with  coherent  radar  clutter.  By  coherent  radar  clutter,  we  mean 
that  the  clutter  is  processed  in  terms  of  its  in  phase  and  out  of  phase  quadrature  components. 
Pre-detection  radar  clutter,  being  a  bandpass  random  process,  admits  a  representation  of  the 
form 

$$y(t) = \mathrm{Re}\{\tilde{y}(t)\exp(j\omega_0 t)\} \tag{4.1}$$

where $\tilde{y}(t) = y_c(t) + j y_s(t)$ denotes the complex envelope of the clutter process, $\omega_0$ is a known
carrier frequency, and $y_c(t)$ and $y_s(t)$ denote the in phase and out of phase quadrature components
of the complex process $\tilde{y}(t)$. Equation (4.1) can be rewritten as

$$y(t) = y_c(t)\cos(\omega_0 t) - y_s(t)\sin(\omega_0 t). \tag{4.2}$$
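
As a quick numerical sanity check, the equivalence of eqs (4.1) and (4.2) can be sketched in Python. The function names and the sample values used to exercise them are illustrative assumptions and do not appear in the report.

```python
import cmath
import math


def bandpass_from_envelope(yc, ys, omega0, t):
    """Eq. (4.1): real part of the complex envelope times the carrier."""
    return (complex(yc, ys) * cmath.exp(1j * omega0 * t)).real


def bandpass_from_quadrature(yc, ys, omega0, t):
    """Eq. (4.2): expansion in the in phase and quadrature components."""
    return yc * math.cos(omega0 * t) - ys * math.sin(omega0 * t)
```

For any fixed values of the quadrature components, carrier frequency and time, the two functions should agree to within floating point rounding.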

We are interested in specifying the joint PDF of N samples obtained by sampling the process
$y(t)$. Since it is always more convenient to work with the associated low pass process, we consider
the equivalent problem of specifying the PDF of N complex samples obtained from the complex
process $\tilde{y}(t)$. The PDF of a complex random variable is defined to be the joint PDF of its in
phase and out of phase quadrature components. Therefore, it follows that the joint PDF of N
complex random variables is the joint PDF of the 2N in phase and out of phase quadrature
components. When dealing with complex random variables, it is sometimes more convenient to
work with their envelope and phase. The envelope $R_i$ and phase $\Theta_i$ of a complex random variable
$\tilde{Y}_i = Y_{ci} + jY_{si}$ are defined by

$$R_i = \sqrt{Y_{ci}^2 + Y_{si}^2}, \qquad \Theta_i = \arctan\left(\frac{Y_{si}}{Y_{ci}}\right). \tag{4.3}$$


We consider the problem of specifying the PDF of a random vector $\mathbf{Y}^T = [\mathbf{Y}_c^T : \mathbf{Y}_s^T]$ obtained
by sampling the random process $\tilde{y}(t)$, where $\mathbf{Y}_c = [Y_{c1}, Y_{c2}, \ldots, Y_{cN}]^T$ and $\mathbf{Y}_s = [Y_{s1}, Y_{s2}, \ldots, Y_{sN}]^T$.
The subscripts c and s denote the in phase and out of phase quadrature components, respectively.
We assume that the process $y(t)$ is a wide sense stationary random process. The necessary and
sufficient conditions for $y(t)$ to be temporally wide sense stationary [42] are:

(A) The quadrature components have zero mean.

(B) The envelope of the pairwise quadrature components is statistically independent of the
phase, and the phase is uniformly distributed over the interval $(0, 2\pi)$. This results in the
pairwise quadrature components being identically distributed and their joint PDF being
circularly symmetric. This also results in the orthogonality of the pairwise quadrature
components at each sampling instant.

(C) The autocovariance function and crosscovariance function of the quadrature processes of the
complex process $\tilde{y}(t) = y_c(t) + j y_s(t)$ satisfy the conditions given by

$$K_{cc}(\tau) = K_{ss}(\tau), \qquad K_{cs}(\tau) = -K_{sc}(\tau) \tag{4.4}$$

where

$$\begin{aligned}
K_{cc}(\tau) &= E\{y_c(t)\,y_c(t-\tau)\}, & K_{ss}(\tau) &= E\{y_s(t)\,y_s(t-\tau)\},\\
K_{cs}(\tau) &= E\{y_c(t)\,y_s(t-\tau)\}, & K_{sc}(\tau) &= E\{y_s(t)\,y_c(t-\tau)\}.
\end{aligned} \tag{4.5}$$

Also, the nonnegative definite property of the covariance matrix of $\mathbf{Y}$ must be satisfied.

(D) Any choice of autocovariance and crosscovariance functions is allowed as long as
requirement (C) is satisfied.

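Due to requirement (A), it follows that $E(\mathbf{Y}) = \mathbf{0}$. Hence, $E(\mathbf{Y}_c) = E(\mathbf{Y}_s) = \mathbf{0}$. As a
consequence of requirements (B) and (C), the covariance matrix of $\mathbf{Y}$, given by

$$\Sigma = \begin{bmatrix} \Sigma_{cc} & \Sigma_{cs} \\ \Sigma_{sc} & \Sigma_{ss} \end{bmatrix} \tag{4.6}$$

must satisfy the conditions

$$\Sigma_{cc} = \Sigma_{ss}, \qquad \Sigma_{cs} = -\Sigma_{sc} \tag{4.7}$$

with the elements of the main diagonal of the matrices $\Sigma_{cs}$ and $\Sigma_{sc}$ being equal to zero. Note
that $\Sigma_{cc} = E\{\mathbf{Y}_c\mathbf{Y}_c^T\}$, $\Sigma_{cs} = E\{\mathbf{Y}_c\mathbf{Y}_s^T\}$, $\Sigma_{sc} = E\{\mathbf{Y}_s\mathbf{Y}_c^T\}$ and $\Sigma_{ss} = E\{\mathbf{Y}_s\mathbf{Y}_s^T\}$. Finally, we
point out that, regardless of the value of N, we always have an even order PDF when dealing with
quadrature components. We are now in a position to proceed with the characterization of $\mathbf{Y}$ as
an SIRV.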
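
The block conditions of eq (4.7) are easy to test numerically. The following is a minimal sketch, assuming the covariance matrix is supplied as a nested list with the N in phase components ordered first; the function names and the tolerance are hypothetical choices made for illustration.

```python
def split_blocks(sigma, n):
    """Split a 2N x 2N covariance matrix into the four N x N blocks of eq. (4.6)."""
    cc = [row[:n] for row in sigma[:n]]
    cs = [row[n:] for row in sigma[:n]]
    sc = [row[:n] for row in sigma[n:]]
    ss = [row[n:] for row in sigma[n:]]
    return cc, cs, sc, ss


def satisfies_external_consistency(sigma, n, tol=1e-12):
    """Check eq. (4.7) and the zero main diagonal of the cross-covariance block."""
    cc, cs, sc, ss = split_blocks(sigma, n)
    for i in range(n):
        if abs(cs[i][i]) > tol:
            return False
        for j in range(n):
            if abs(cc[i][j] - ss[i][j]) > tol:
                return False
            if abs(cs[i][j] + sc[i][j]) > tol:
                return False
    return True
```

Because $\Sigma_{sc}$ is the transpose of $\Sigma_{cs}$ by definition, the second condition of eq (4.7) forces $\Sigma_{cs}$ to be skew-symmetric, which is why its main diagonal vanishes.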

For an SIRV, it is pointed out that the PDF of a given order automatically implies all lower
order PDFs. For example, if N random variables are jointly Gaussian, it is well known that
the ith order PDF, $i = 1, 2, \ldots, N-1$, is multivariate Gaussian. This property of SIRVs is
called internal consistency. The requirements (A)-(D) arising from the wide sense stationarity
requirements of the process $y(t)$ are called external consistency conditions. Requirements (A)-(D)
are not inherent to the SIRP and do not hold when the SIRP is not wide sense stationary.

4.3  Techniques  for  Determining  the  SIRV  PDF 

In this section, several techniques are presented for obtaining $h_{2N}(p)$. For convenience,
temporal wide sense stationarity of the underlying bandpass process is assumed. However, the
functional form of $h_{2N}(\cdot)$ is unaffected whether or not the random process is temporally wide
sense stationary. Hence, it is allowable to let $p = (\mathbf{y}-\mathbf{b})^T\Sigma^{-1}(\mathbf{y}-\mathbf{b})$ in the final result, where
$\mathbf{b}$ is any non-zero mean vector and $\Sigma$ is any non-negative definite matrix.

Recall from Chapter 2 that the PDF of an SIRV $\mathbf{Y}^T = [\mathbf{Y}_c^T : \mathbf{Y}_s^T]$, with $\mathbf{Y}_c$ and $\mathbf{Y}_s$ defined in
Section 4.2, is given by

$$f_{\mathbf{Y}}(\mathbf{y}) = (2\pi)^{-N}|\Sigma|^{-1/2}\,h_{2N}(p). \tag{4.8}$$

Assuming temporal wide sense stationarity, $p = \mathbf{y}^T\Sigma^{-1}\mathbf{y}$ where $\Sigma$ is given by eq (4.6). The
mean vector of $\mathbf{Y}$ is zero due to requirement (A) in Section 4.2. The covariance matrix $\Sigma$ having
the form of eq (4.6) and satisfying the requirements of eq (4.7) is readily determined when the
autocorrelation function of the process is specified. Given $\Sigma$, several techniques for obtaining
$h_{2N}(p)$ are presented in this section.

The  representation  theorem  for  SIRVs  allows  us  to  express  Y  as  a  product  of  a  Gaussian 
random  vector  Z,  having  the  same  dimensions  as  Y  and  a  non-negative  variable  S.  For  the 
problem  of  radar  clutter  modeling,  since  it  is  desirable  to  control  the  non-Gaussian  nature  of  Y 
and  its  correlation  properties  independently,  we  assume  that  the  random  variable  S  is  statistically 
independent  of  Z.  In  addition,  the  covariance  matrix  of  the  SIRV  can  be  made  equal  to  the 
covariance matrix of the Gaussian random vector by requiring $E(S^2)$ to be unity. Finally, it is
pointed  out  that  the  mean  of  Z  is  necessarily  zero. 

A  physical  interpretation  can  be  given  to  Z  and  S.  Consider  a  surveillance  volume  subdivided 
into contiguous range-Doppler-azimuth cells. Assuming a large enough cell size such that many
scatterers  are  located  in  each  cell,  the  N  pulse  returns  from  a  given  cell  can  be  modeled  as 
the  Gaussian  vector  Z  due  to  the  central  limit  theorem.  Also  assume  that  the  average  clutter 
power  remains  constant  over  the  N  pulse  returns  in  a  coherent  processing  interval.  However, 
the  average  clutter  power  is  allowed  to  vary  independently  from  cell  to  cell  since  different  sets  of 
scatterers  are  located  in  each  cell.  The  variation  of  the  average  clutter  power  from  cell  to  cell  is 
modeled  by  the  square  of  the  non-negative  random  variable  S. 
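
The product representation lends itself to a simple simulation. The sketch below is illustrative only: it draws a two-dimensional Gaussian vector Z from a Cholesky factor supplied by the caller and multiplies it by an independent discrete random variable S. The discrete distribution for S, the function names and the sample sizes are assumptions made here for illustration; the only property taken from the text is that $E(S^2) = 1$ makes the SIRV covariance equal to that of Z.

```python
import math
import random


def sample_sirv(n_samples, chol, s_values, s_probs, rng):
    """Draw Y = S * Z with Z ~ N(0, L L^T) and S drawn independently of Z."""
    dim = len(chol)
    samples = []
    for _ in range(n_samples):
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Lower-triangular Cholesky factor applied to white Gaussian noise.
        z = [sum(chol[i][j] * g[j] for j in range(i + 1)) for i in range(dim)]
        s = rng.choices(s_values, weights=s_probs)[0]
        samples.append([s * zi for zi in z])
    return samples


def sample_covariance(samples):
    """Zero-mean sample covariance matrix."""
    dim = len(samples[0])
    n = len(samples)
    return [[sum(y[i] * y[j] for y in samples) / n for j in range(dim)]
            for i in range(dim)]
```

Each draw of S can be read as the clutter amplitude scale of one range-Doppler-azimuth cell, held fixed over the returns collected from that cell.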
4.3.1 SIRVs with Known Characteristic PDF

We consider specification of the PDF of the SIRV $\mathbf{Y}$ when its characteristic PDF is known
in closed form. We have pointed out in the previous section that the mean vector of $\mathbf{Y}$ is zero.
Also, we have discussed the specification of the covariance matrix of $\mathbf{Y}$. We now focus on the
specification of $h_{2N}(p)$. As a consequence of the representation theorem, we can write

$$h_{2N}(p) = \int_0^\infty s^{-2N}\exp\left(-\frac{p}{2s^2}\right) f_S(s)\,ds. \tag{4.9}$$

Equation (4.9) enables us to specify $h_{2N}(p)$ when the characteristic PDF $f_S(s)$ is known in
closed form. However, in some cases, even though an analytical expression is known for the
characteristic PDF, it may be difficult to evaluate the integral in eq (4.9) in closed form. In such
instances, an alternate method for specifying $h_{2N}(p)$ must be examined. The method presented
in the next section is useful for these cases.

4.3.2  SIRVs  with  Unknown  Characteristic  PDFs 

When the characteristic PDF of the SIRV is unknown or when the integral in eq (4.9) is difficult
to evaluate, we propose an alternate method to obtain $h_{2N}(p)$. Recall that we are dealing with
an even order PDF. Therefore, we can use eq (3.25) starting with $h_2(w)$ to obtain $h_{2N}(w)$. It is
worthwhile pointing out that $h_2(\cdot)$ is related to the first order envelope PDF. From requirement
(B) of Section 4.2, the joint PDF of the ith in phase and out of phase quadrature components
can be expressed as

$$f_{Y_{ci},Y_{si}}(y_{ci}, y_{si}) = (2\pi)^{-1}\sigma^{-2}\,h_2(p) \qquad (i = 1, 2, \ldots, N) \tag{4.10}$$

where $p = (y_{ci}^2 + y_{si}^2)/\sigma^2$ and $\sigma^2$ denotes the common variance of the in phase and out of phase
quadrature components. The envelope and phase corresponding to the ith quadrature components
are given by

$$R_i = \sqrt{Y_{ci}^2 + Y_{si}^2}, \qquad \Theta_i = \arctan\frac{Y_{si}}{Y_{ci}}. \tag{4.11}$$

Due to the assumption of wide sense stationarity, we can drop the subscript i in eq (4.11). The
Jacobian of the transformation given by eq (4.11) is $J = R^{-1}$, where J denotes the Jacobian.
Using the Jacobian in eq (4.10) results in the joint PDF of R and $\Theta$ being given by

$$f_{R,\Theta}(r, \theta) = \frac{r}{2\pi\sigma^2}\,h_2\left(\frac{r^2}{\sigma^2}\right). \tag{4.12}$$

Clearly, the joint PDF in eq (4.12) can be factored as a product of the marginal PDFs of
the random variables R and $\Theta$. Consequently, the random variables R and $\Theta$ are statistically
independent with PDFs given by

$$f_R(r) = \frac{r}{\sigma^2}\,h_2\left(\frac{r^2}{\sigma^2}\right) \quad (0 \le r < \infty), \qquad f_\Theta(\theta) = (2\pi)^{-1} \quad (0 \le \theta \le 2\pi). \tag{4.13}$$

Equation (4.13) relates the envelope PDF to $h_2(\cdot)$. Hence, we can write

$$h_2\left(\frac{r^2}{\sigma^2}\right) = \frac{\sigma^2}{r}\,f_R(r). \tag{4.14}$$

Thus, eq (4.14) provides a mechanism to obtain $h_2(w)$. Starting from $h_2(w)$, we then use eq (3.25)
to obtain $h_{2N}(w)$. Since not all non-Gaussian envelope PDFs are admissible for characterization
as SIRVs, we must check that $h_2(w)$ and its derivatives satisfy the monotonicity conditions stated
in Chapter 2. Finally, $h_{2N}(p)$ is obtained by simply replacing w by $p = (\mathbf{y}-\mathbf{b})^T\Sigma^{-1}(\mathbf{y}-\mathbf{b})$ in
$h_{2N}(w)$.
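
The two steps of this method can be sketched numerically. The code below is a hedged illustration: it assumes that eq (3.25), which lies outside this chapter, has the successive-differentiation form $h_{2N+2}(w) = -2\,dh_{2N}(w)/dw$, and it replaces the analytic derivative with a central difference. The function names and the step size are hypothetical.

```python
import math


def h2_from_envelope(envelope_pdf, sigma):
    """Eq. (4.14): h_2(w) = sigma^2 f_R(r) / r evaluated at r = sigma * sqrt(w), w > 0."""
    def h2(w):
        r = sigma * math.sqrt(w)
        return sigma ** 2 * envelope_pdf(r) / r
    return h2


def next_even_order(h, step=1e-5):
    """Assumed form of eq. (3.25): h_{2N+2}(w) = -2 dh_{2N}(w)/dw, by central difference."""
    def h_next(w):
        return -2.0 * (h(w + step) - h(w - step)) / (2.0 * step)
    return h_next


def rayleigh_pdf(sigma):
    """Rayleigh envelope PDF whose quadrature components have variance sigma^2."""
    return lambda r: (r / sigma ** 2) * math.exp(-r ** 2 / (2.0 * sigma ** 2))
```

For the Rayleigh envelope the procedure should return $\exp(-w/2)$ at every order, which is the Gaussian case. In practice the differentiation would be carried out analytically, as in the examples of Section 4.4.2, since repeated finite differences lose accuracy quickly.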

4.3.3  Hankel  Transform  Approach 

In this section we present an approach based on the Hankel transform for specifying $h_{2N}(p)$.
Recall that the joint PDF of the ith in phase and out of phase quadrature components of $\mathbf{Y}$ is
given by eq (4.10). For convenience, it is assumed that $\sigma^2$ is unity. Dropping the subscript i
from eq (4.10), the joint characteristic function of $Y_c$ and $Y_s$ is expressed as

$$\phi_{Y_c,Y_s}(\omega_1, \omega_2) = (2\pi)^{-1}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\exp(j\omega_1 y_c + j\omega_2 y_s)\,h_2(y_c^2 + y_s^2)\,dy_c\,dy_s. \tag{4.15}$$

Introducing the transformations

$$R = \sqrt{Y_c^2 + Y_s^2}, \qquad \Theta = \arctan\frac{Y_s}{Y_c}, \qquad \omega = \sqrt{\omega_1^2 + \omega_2^2}, \qquad \alpha = \arctan\frac{\omega_2}{\omega_1} \tag{4.16}$$

we can rewrite eq (4.15) as

$$\phi_{Y_c,Y_s}(\omega_1, \omega_2) = (2\pi)^{-1}\int_0^\infty\int_0^{2\pi}\exp[j\omega r\{\cos(\theta)\cos(\alpha) + \sin(\theta)\sin(\alpha)\}]\,r\,h_2(r^2)\,dr\,d\theta. \tag{4.17}$$

Noting that $\cos(A - B) = \cos(A)\cos(B) + \sin(A)\sin(B)$, we can rewrite eq (4.17) as

$$\phi_{Y_c,Y_s}(\omega_1, \omega_2) = (2\pi)^{-1}\int_0^\infty\int_0^{2\pi}\exp[j\omega r\cos(\theta - \alpha)]\,r\,h_2(r^2)\,dr\,d\theta. \tag{4.18}$$

Interchanging the order of integration in eq (4.18), and recognizing that [45]

$$J_0(x) = \frac{1}{2\pi}\int_0^{2\pi}\exp[jx\cos(\theta - \alpha)]\,d\theta, \tag{4.19}$$

where $J_0(x)$ is the Bessel function of order zero, we have

$$\phi_{Y_c,Y_s}(\omega_1, \omega_2) = \int_0^\infty r\,h_2(r^2)\,J_0(\omega r)\,dr. \tag{4.20}$$

From eq (4.20), it is clear that the joint characteristic function of $Y_c$ and $Y_s$ is a function of
$\omega = \sqrt{\omega_1^2 + \omega_2^2}$. Hence, it is a circularly symmetric characteristic function. Denoting this function
by $\Psi(\omega)$, we can write

$$\Psi(\omega) = \int_0^\infty r\,h_2(r^2)\,J_0(\omega r)\,dr. \tag{4.21}$$

Equation (4.21) is recognized as the Hankel transform of order zero of $h_2(r^2)$. Using the inverse
Hankel transform, we obtain

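$$h_2(r^2) = \int_0^\infty \omega\,\Psi(\omega)\,J_0(\omega r)\,d\omega. \tag{4.22}$$

Introducing the dummy variable w, we can write

$$h_2(w) = \int_0^\infty \omega\,\Psi(\omega)\,J_0(\omega\sqrt{w})\,d\omega. \tag{4.23}$$

We then use eq (3.25) to obtain $h_{2N}(w)$. More explicitly, we can write

$$h_{2N}(w) = (-2)^{N-1}\int_0^\infty \omega\,\Psi(\omega)\,\frac{d^{N-1}}{dw^{N-1}}\left[J_0(\omega\sqrt{w})\right]d\omega. \tag{4.24}$$

Using the identity [45]

$$\frac{dJ_0(v)}{dv} = -J_1(v) \tag{4.25}$$

we have

$$\frac{dJ_0(\omega\sqrt{w})}{dw} = -\frac{\omega}{2}(\sqrt{w})^{-1}J_1(\omega\sqrt{w}). \tag{4.26}$$

Use of the recurrence relation [45]

$$\frac{d}{dv}\left[v^{-\alpha}J_\alpha(v)\right] = -v^{-\alpha}J_{\alpha+1}(v) \tag{4.27}$$

results in

$$\frac{d^2}{dw^2}\left[J_0(\omega\sqrt{w})\right] = \frac{\omega^2}{4}(\sqrt{w})^{-2}J_2(\omega\sqrt{w}). \tag{4.28}$$

Repeated use of eq (4.27) gives

$$\frac{d^{N-1}}{dw^{N-1}}\left[J_0(\omega\sqrt{w})\right] = (-1)^{N-1}\frac{\omega^{N-1}}{2^{N-1}}(\sqrt{w})^{-N+1}J_{N-1}(\omega\sqrt{w}). \tag{4.29}$$

Substituting eq (4.29) in eq (4.24) gives

$$h_{2N}(w) = (\sqrt{w})^{1-N}\int_0^\infty \omega^N\,\Psi(\omega)\,J_{N-1}(\omega\sqrt{w})\,d\omega. \tag{4.30}$$

Finally, $h_{2N}(p)$ is obtained from eq (4.30) by replacing w by $p = (\mathbf{y}-\mathbf{b})^T\Sigma^{-1}(\mathbf{y}-\mathbf{b})$. This
completes the proof of eq (3.27) for even values of N which had been previously deferred. The
integral in eq (4.30) is recognized as the Hankel transform of order N - 1 of $\Psi(\omega)$. A number
of Hankel transforms have been provided in [46] and these will be made use of in the examples
presented in Section 4.4.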
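
The inverse Hankel transform of eq (4.23) can be checked numerically with the standard library alone. The sketch below is an illustration under stated assumptions: $J_0$ is evaluated from the integral representation of eq (4.19) with the periodic trapezoid rule, the outer integral is truncated at a finite upper limit and evaluated with Simpson's rule, and the grid sizes and function names are hypothetical choices.

```python
import math


def bessel_j0(x, n_theta=64):
    """J_0(x) from eq. (4.19); only the real part cos(x cos(theta)) contributes."""
    total = 0.0
    for k in range(n_theta):
        total += math.cos(x * math.cos(2.0 * math.pi * k / n_theta))
    return total / n_theta


def h2_from_char_function(psi, w, omega_max=10.0, n_steps=1000):
    """Eq. (4.23) by Simpson's rule on [0, omega_max]; n_steps must be even."""
    step = omega_max / n_steps
    root_w = math.sqrt(w)

    def integrand(omega):
        return omega * psi(omega) * bessel_j0(omega * root_w)

    total = integrand(0.0) + integrand(omega_max)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * step)
    return total * step / 3.0


def gaussian_psi(omega):
    """Characteristic function of a zero mean, unit variance Gaussian component."""
    return math.exp(-omega ** 2 / 2.0)
```

With the Gaussian characteristic function the result should reproduce $h_2(w) = \exp(-w/2)$. The truncation limit is only adequate when $\Psi(\omega)$ decays quickly; a slowly decaying characteristic function would need a larger limit or a dedicated Hankel transform routine.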
4.4 Examples of complex SIRVs

This section presents examples based on the approaches discussed in Section 4.3 and is divided
into three parts. In Section 4.4.1, we present examples that assume knowledge of the
characteristic PDF. In Section 4.4.2, the marginal envelope PDF is assumed to be known whereas
in Section 4.4.3, knowledge of the marginal characteristic function is assumed. Finally, in Section 4.4.4
we point out some univariate PDFs that cannot be generalized to SIRV characterization. We
consider the problem of determining the PDF of the random vector $\mathbf{Y}^T = [\mathbf{Y}_c^T : \mathbf{Y}_s^T]$ specified in
Section 4.2. It is assumed that the mean vector of $\mathbf{Y}$ and its covariance matrix $\Sigma$ are known.
Consequently, specification of the PDF of $\mathbf{Y}$ of the form of eq (4.8) reduces to determination of
$h_{2N}(p)$.

4.4.1  Examples  Based  on  the  Characteristic  PDF 

4.4.1.1 Gaussian Distribution

The Gaussian marginal PDF for the quadrature components having mean $b_k$ and variance $\sigma_k^2$ is

$$f_{Y_k}(y_k) = \frac{1}{\sqrt{2\pi}\,\sigma_k}\exp\left(-\frac{(y_k - b_k)^2}{2\sigma_k^2}\right) \qquad (-\infty < y_k < \infty). \tag{4.31}$$

The characteristic PDF for this example is given by

$$f_S(s) = \delta(s - 1) \tag{4.32}$$

where $\delta(\cdot)$ is the unit impulse function. Using eq (3.16), it is seen that the resulting $h_N(p)$ is
given by

$$h_N(p) = \exp\left(-\frac{p}{2}\right) \tag{4.33}$$

where $p = (\mathbf{y}-\mathbf{b})^T\Sigma^{-1}(\mathbf{y}-\mathbf{b})$. The corresponding PDF for any N is given by eq (3.15). For
N = 1, this result reduces to eq (4.31). While dealing with quadrature components, we obtain
the corresponding $h_{2N}(p)$ by simply replacing N by 2N in eq (4.33). Whenever a characteristic
PDF can be made to approach a unit impulse function displaced to the right of the origin by
appropriate choice of its parameters, it follows that the corresponding SIRV PDF will approach
the Gaussian PDF.

4.4.1.2 K-Distribution

The K-distributed envelope PDF, by definition, is given by

$$f_R(r) = \frac{2b}{\Gamma(\alpha)}\left(\frac{br}{2}\right)^{\alpha}K_{\alpha-1}(br)\,u(r) \tag{4.34}$$

where $\alpha$ is the shape parameter of the distribution, b denotes the scale parameter of the
distribution, $K_N(t)$ is the Nth order modified Bessel function of the second kind and $u(r)$ is the unit step
function. The K-distributed envelope PDF is commonly used for modeling radar clutter PDFs
that have extended tails [32]-[33] and [15]-[22]. In particular, the PDF becomes heavy tailed as
$\alpha$ approaches zero. Plots of eq (4.34) for several values of $\alpha$ are shown in Figures 4.1-4.4.

The K-distributed envelope PDF arises when we consider the product of a Rayleigh distributed
random variable $R'$ and an independent random variable V having the generalized Chi
distribution. More precisely, we consider the product $R = R'V$, with $R'$ and V being statistically
independent. Their PDFs are given by

$$f_{R'}(r') = r'\exp\left(-\frac{r'^2}{2}\right) \qquad (0 \le r' < \infty) \tag{4.35}$$

and

$$f_V(v) = \frac{2b}{\Gamma(\alpha)2^{\alpha}}(bv)^{2\alpha-1}\exp\left(-\frac{b^2v^2}{2}\right)u(v), \tag{4.36}$$

respectively. Consequently, the PDF of R is given by

$$f_R(r) = \int_0^\infty f_{R|V}(r|v)\,f_V(v)\,dv = \int_0^\infty \frac{r}{v^2}\exp\left(-\frac{r^2}{2v^2}\right)\frac{2b}{\Gamma(\alpha)2^{\alpha}}(bv)^{2\alpha-1}\exp\left(-\frac{b^2v^2}{2}\right)dv. \tag{4.37}$$

From [45], we have

$$K_\nu(xz) = \frac{z^\nu}{2}\int_0^\infty \exp\left[-\frac{x}{2}\left(t + \frac{z^2}{t}\right)\right]t^{-\nu-1}\,dt \qquad \left[|\arg z| < \frac{\pi}{4}\right],\ z > 0. \tag{4.38}$$

Letting $v^2 = t$ in eq (4.37) and using the result of eq (4.38), the PDF of eq (4.34) follows.
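
The step from eq (4.37) to eq (4.34) can be checked numerically without a Bessel function library by choosing $\alpha = 1.5$, for which $K_{1/2}(t) = \sqrt{\pi/(2t)}\exp(-t)$ is available in closed form. The following is a sketch under those assumptions; the substitution $v = e^x$, the integration limits, the step size and the function names are illustrative choices and not part of the report.

```python
import math


def generalized_chi_pdf(v, alpha, b):
    """Eq. (4.36): PDF of the modulating variable V for v > 0."""
    scale = 2.0 * b / (math.gamma(alpha) * 2.0 ** alpha)
    return scale * (b * v) ** (2.0 * alpha - 1.0) * math.exp(-(b * v) ** 2 / 2.0)


def envelope_pdf_by_integration(r, alpha, b, x_min=-8.0, x_max=5.0, step=0.01):
    """Eq. (4.37) by the trapezoid rule after substituting v = exp(x), dv = v dx."""
    n = int(round((x_max - x_min) / step))
    total = 0.0
    for i in range(n + 1):
        v = math.exp(x_min + i * step)
        rayleigh_given_v = (r / v ** 2) * math.exp(-r ** 2 / (2.0 * v ** 2))
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * rayleigh_given_v * generalized_chi_pdf(v, alpha, b) * v
    return total * step


def k_envelope_pdf_alpha_three_halves(r, b):
    """Eq. (4.34) for alpha = 1.5, using the closed form of K_{1/2}."""
    t = b * r
    k_half = math.sqrt(math.pi / (2.0 * t)) * math.exp(-t)
    return 2.0 * b / math.gamma(1.5) * (t / 2.0) ** 1.5 * k_half
```

For other values of $\alpha$ the same integration routine applies, but the closed-form reference would need a general modified Bessel function of the second kind.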

As a matter of interest, we demonstrate the derivation of the PDF for the quadrature
components arising from the K-distributed envelope PDF. The quadrature components corresponding to
the Rayleigh envelope PDF $f_{R'}(r')$ are independent identically distributed zero mean Gaussian
random variables having unit variance. The PDF of the quadrature components corresponding
to $R'$ is expressed as

$$f_{Z_c}(z) = f_{Z_s}(z) = (2\pi)^{-1/2}\exp\left(-\frac{z^2}{2}\right) \tag{4.39}$$

where $Z_c$ and $Z_s$ denote the in phase and out of phase quadrature components. The quadrature
components arising from the K-distributed envelope PDF, denoted by $Y_c$ and $Y_s$, respectively,
can be expressed as

$$Y_c = Z_c V, \qquad Y_s = Z_s V. \tag{4.40}$$

Note that $|\tilde{Y}| = |\tilde{Z}|V$ and $\Theta_Y = \Theta_Z$. Consequently, the PDF of $Y_c$ is given by

$$f_{Y_c}(y_c) = \frac{2b^2}{\sqrt{2\pi}\,\Gamma(\alpha)2^{\alpha}}\int_0^\infty (bv)^{2\alpha-2}\exp\left[-\frac{1}{2}\left(\frac{y_c^2}{v^2} + b^2v^2\right)\right]dv. \tag{4.41}$$

Making the change of variables $t = b^2v^2$ and $z^2 = b^2y_c^2$, and using the result of eq (4.38), the
PDF of $Y_c$ is expressed as

$$f_{Y_c}(y_c) = \frac{2b}{\sqrt{2\pi}\,\Gamma(\alpha)2^{\alpha}}\,(b|y_c|)^{\alpha-\frac{1}{2}}\,K_{\frac{1}{2}-\alpha}(b|y_c|) \qquad (-\infty < y_c < \infty) \tag{4.42}$$

where the absolute value denoted by $|\cdot|$ is used on account of the requirement that z > 0. In a
similar manner, it can be shown that the PDF of $Y_s$ has the same functional form as eq (4.42).
The PDF of eq (4.42) is called the Generalized Laplace PDF [29].

The characteristic PDF for the K-distributed SIRV is

$$f_S(s) = \frac{2b}{\Gamma(\alpha)2^{\alpha}}(bs)^{2\alpha-1}\exp\left(-\frac{b^2s^2}{2}\right)u(s). \tag{4.43}$$

Using eqs (3.16) and (4.38),

$$h_N(p) = \int_0^\infty s^{-N}\exp\left(-\frac{p}{2s^2}\right)\frac{2b}{\Gamma(\alpha)2^{\alpha}}(bs)^{2\alpha-1}\exp\left(-\frac{b^2s^2}{2}\right)ds. \tag{4.44}$$

Making the change of variables $t = b^2s^2$ and $z^2 = b^2p$, the resulting $h_N(p)$ is given by

$$h_N(p) = \frac{b^N}{\Gamma(\alpha)}\,\frac{(b\sqrt{p})^{\alpha-\frac{N}{2}}}{2^{\alpha-1}}\,K_{\frac{N}{2}-\alpha}(b\sqrt{p}). \tag{4.45}$$

The corresponding SIRV PDF for any N is given by using eq (3.15). For the case when N = 1,
this reduces to eq (4.42). When dealing with quadrature components, we use eq (4.45) with N
replaced by 2N.

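4.4.1.3 Student-t Distribution

The Student-t distribution for the quadrature components is given by

$$f_{Y_k}(y_k) = \frac{\Gamma(\nu + \frac{1}{2})}{b\sqrt{\pi}\,\Gamma(\nu)}\left(1 + \frac{y_k^2}{b^2}\right)^{-\nu-\frac{1}{2}} \qquad (-\infty < y_k < \infty),\ \nu > 0 \tag{4.46}$$

where b is the scale parameter, $\nu$ is the shape parameter, $\Gamma(\nu)$ is the Eulero-Gamma function and
k = c, s. Plots of the Student-t distribution are shown for several values of $\nu$ in Figures 4.5-4.7.
The characteristic PDF for this example is

$$f_S(s) = \frac{2}{\Gamma(\nu)}\left(\frac{b^2}{2}\right)^{\nu}s^{-(2\nu+1)}\exp\left(-\frac{b^2}{2s^2}\right)u(s). \tag{4.47}$$

Use of eq (3.16) results in $h_N(p)$ being given by

$$h_N(p) = \frac{2^{\frac{N}{2}}\,b^{2\nu}\,\Gamma(\nu + \frac{N}{2})}{\Gamma(\nu)\,(b^2 + p)^{\frac{N}{2}+\nu}}. \tag{4.48}$$

The corresponding SIRV PDF for any N is given by eq (3.15). For N = 1, this result reduces to
eq (4.46). When dealing with quadrature components, we make use of eq (4.48) with N replaced
by 2N.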

4.4.1.4 Mixture of Gaussian PDFs

An interesting non-Gaussian marginal PDF that is admissible as an SIRV is the mixture of
Gaussian PDFs. We consider the PDF given by

$$f_{Y_k}(y_k) = \sum_{i} a_i\,(2\pi k_i^2)^{-1/2}\exp\left(-\frac{y_k^2}{2k_i^2}\right) \tag{4.49}$$

for the quadrature components of $\mathbf{Y}$. The characteristic PDF for this example is given by

$$f_S(s) = \sum_{i} a_i\,\delta(s - k_i). \tag{4.50}$$

Note that S is a discrete random variable, with $a_i$ denoting the probability $P(S = k_i)$. Also, it
is required that

$$a_i > 0, \quad i = 1, 2, \ldots, \qquad \sum_{i} a_i = 1. \tag{4.51}$$

Using eq (3.16), it is seen that

$$h_N(p) = \sum_{i} k_i^{-N}\,a_i\exp\left(-\frac{p}{2k_i^2}\right). \tag{4.52}$$

The corresponding SIRV PDF for any N is given by eq (3.15). For N = 1, this result reduces
to eq (4.49). When dealing with quadrature components, we make use of the result of eq (4.52)
with N replaced by 2N. Note that the $a_i$'s can be assigned any convenient discrete distribution.
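
Because the characteristic PDF is discrete, the mixture case is straightforward to evaluate. The sketch below is illustrative: it assumes that eq (3.15), which is outside this chapter, gives the N = 1 marginal as $(2\pi)^{-1/2}h_1(y^2)$ for unit variance, and the weights and scales passed in by the caller are hypothetical values.

```python
import math


def h_mixture(p, n, weights, scales):
    """Eq. (4.52): h_N(p) for a Gaussian-mixture SIRV."""
    return sum(a * k ** (-n) * math.exp(-p / (2.0 * k ** 2))
               for a, k in zip(weights, scales))


def marginal_pdf(y, weights, scales):
    """Eq. (4.49): marginal PDF of one quadrature component."""
    return sum(a / math.sqrt(2.0 * math.pi * k ** 2) * math.exp(-y ** 2 / (2.0 * k ** 2))
               for a, k in zip(weights, scales))


def second_moment_of_s(weights, scales):
    """E(S^2) for the discrete characteristic PDF of eq. (4.50)."""
    return sum(a * k ** 2 for a, k in zip(weights, scales))
```

Choosing the scales so that `second_moment_of_s` returns one keeps the SIRV covariance equal to that of the underlying Gaussian vector, as discussed in Section 4.3.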

4.4.2  Examples  Based  on  Marginal  Envelope  PDF 

We shall report here on some new SIRV PDFs obtained starting from the marginal envelope
PDF. In general, note that the characteristic PDFs for the examples considered here are not
available in closed form. Since $\sigma^2$ is the common variance of the in phase and out of phase
quadrature components, $\sigma^2$ is equal to $\frac{1}{2}E(R^2)$. In addition, recall that the binomial coefficient
is defined by

$$\binom{l}{i} = \frac{l!}{i!\,(l-i)!}. \tag{4.53}$$

In all the examples in this section, we start with $h_2(w)$ and obtain $h_{2N}(w)$ by the process of
successive differentiation. The corresponding $h_{2N}(p)$ for each example is obtained by replacing
w by p in $h_{2N}(w)$. In all the examples presented in this section, note that the envelope PDFs
reduce to the Rayleigh envelope PDF for appropriately chosen parameters.

4.4.2.1 Chi Envelope PDF

We consider the Chi distributed envelope PDF given by

$$f_R(r) = \frac{2b}{\Gamma(\nu)}(br)^{2\nu-1}\exp(-b^2r^2) \qquad (0 \le r < \infty) \tag{4.54}$$

where b denotes the scale parameter and $\nu$ denotes the shape parameter. Plots of the Chi
envelope PDF are shown in Figures 4.8-4.10 for several values of $\nu$. Using eq (4.14), we can
write

$$h_2(w) = \frac{2(b\sigma)^{2\nu}}{\Gamma(\nu)}\,w^{\nu-1}\exp(-b^2\sigma^2 w). \tag{4.55}$$

[Figure 4.10: Chi Envelope PDF, b = 0.5]


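Using eq (3.25), we have

$$h_{2N}(w) = (-2)^{N-1}\frac{d^{N-1}h_2(w)}{dw^{N-1}} = (-2)^{N-1}\frac{2(b\sigma)^{2\nu}}{\Gamma(\nu)}\frac{d^{N-1}}{dw^{N-1}}\left[w^{\nu-1}\exp(-b^2\sigma^2 w)\right]. \tag{4.56}$$

Recall Leibnitz's theorem for the nth derivative of a product [45], which states that

$$\frac{d^n(uv)}{dx^n} = \sum_{k=0}^{n}\binom{n}{k}\frac{d^k u}{dx^k}\frac{d^{n-k}v}{dx^{n-k}} \tag{4.57}$$

where u and v are functions of x. Noting that

$$\frac{d^k(w^{\nu-1})}{dw^k} = \frac{\Gamma(\nu)}{\Gamma(\nu-k)}\,w^{\nu-k-1} \tag{4.58}$$

it follows that

$$h_{2N}(w) = (-2)^{N-1}A\sum_{k=1}^{N}G_k\,w^{\nu-k}\exp(-Bw) \tag{4.59}$$

where

$$G_k = \binom{N-1}{k-1}(-1)^{N-k}B^{N-k}\frac{\Gamma(\nu)}{\Gamma(\nu-k+1)}, \qquad A = \frac{2(b\sigma)^{2\nu}}{\Gamma(\nu)}, \qquad B = b^2\sigma^2. \tag{4.60}$$

An important condition that must be pointed out is that the SIRV PDF is valid only for $\nu \le 1$.
This is due to the fact that $h_2(p)$ and its derivatives are monotonically decreasing functions only
in the range of values of $\nu$ mentioned above. Finally, for $\nu = 1$, note that the Chi envelope PDF
reduces to the Rayleigh envelope PDF. The corresponding SIRV PDF then becomes Gaussian.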
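
The closed form of eqs (4.59) and (4.60) can be evaluated directly. The sketch below is illustrative and makes one implementation choice that is not in the report: the ratio $\Gamma(\nu)/\Gamma(\nu-k+1)$ is computed as the finite product $(\nu-1)(\nu-2)\cdots(\nu-k+1)$, which is the same quantity but avoids evaluating the Gamma function at its poles when $\nu$ is an integer. The function names are hypothetical.

```python
import math


def gamma_ratio(nu, k):
    """Gamma(nu) / Gamma(nu - k + 1) written as (nu-1)(nu-2)...(nu-k+1)."""
    result = 1.0
    for j in range(1, k):
        result *= nu - j
    return result


def h2n_chi(w, n, nu, b, sigma):
    """Eqs. (4.59)-(4.60) for the Chi envelope; n is N, so the PDF order is 2N."""
    a_const = 2.0 * (b * sigma) ** (2.0 * nu) / math.gamma(nu)
    b_const = (b * sigma) ** 2
    total = 0.0
    for k in range(1, n + 1):
        g_k = (math.comb(n - 1, k - 1) * (-1.0) ** (n - k)
               * b_const ** (n - k) * gamma_ratio(nu, k))
        total += g_k * w ** (nu - k)
    return (-2.0) ** (n - 1) * a_const * total * math.exp(-b_const * w)
```

Two properties make convenient checks: with $\nu = 1$ and $b^2\sigma^2 = 1/2$ the function should return $\exp(-w/2)$ for every N, and consecutive orders should satisfy the differentiation relation used in eq (4.56).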
4.4.2.2 Weibull Envelope PDF

The Weibull distributed envelope PDF is given by

$$f_R(r) = a\,b\,r^{b-1}\exp(-a r^b) \qquad (0 \le r < \infty) \tag{4.61}$$

where a is the scale parameter and b is the shape parameter. Plots of the Weibull distribution
for several values of b are shown in Figures 4.12-4.14. Using eq (4.14), we have

$$h_2(w) = a\,b\,\sigma^b\,w^{\frac{b}{2}-1}\exp(-a\sigma^b w^{\frac{b}{2}}) = (-2)\frac{d}{dw}\left[\exp(-A w^{\frac{b}{2}})\right] \tag{4.62}$$

where $A = a\sigma^b$. From eq (3.25), we have

$$h_{2N}(w) = (-2)^N\frac{d^N}{dw^N}\left[\exp(-A w^{\frac{b}{2}})\right]. \tag{4.63}$$

The rule for obtaining the Nth derivative of a composite function is [45]: If $f(x) = F(y)$ and
$y = \varphi(x)$, then

$$\frac{d^N}{dx^N}f(x) = \sum_{k=1}^{N}\frac{U_k}{k!}\frac{d^k}{dy^k}\left[F(y)\right] \tag{4.64}$$

where

$$U_k = \sum_{m=1}^{k}(-1)^{k-m}\binom{k}{m}y^{k-m}\frac{d^N y^m}{dx^N}. \tag{4.65}$$

Making the association $x = w$ and $y = -A w^{\frac{b}{2}}$, we have

$$h_{2N}(w) = \sum_{k=1}^{N}C_k\,w^{\frac{kb}{2}-N}\exp(-A w^{\frac{b}{2}}) \tag{4.66}$$

where

$$C_k = \sum_{m=1}^{k}(-1)^{m+N}\,2^N\,\frac{A^k}{k!}\binom{k}{m}\frac{\Gamma(1 + \frac{mb}{2})}{\Gamma(1 + \frac{mb}{2} - N)}. \tag{4.67}$$

The Weibull envelope PDF is admissible for characterization as an SIRV for values of b less than
or equal to 2. This is due to the fact that $h_2(w)$ and its derivatives fail to satisfy the monotonicity
condition for other values of b. However, this is not a serious restriction from the point of view of
radar clutter modeling because the Weibull envelope PDF is of interest in modeling large tailed
clutter. Such a situation arises only when 0 < b < 2. The Weibull envelope PDF reduces to
the Rayleigh envelope PDF when b = 2. The corresponding SIRV PDF then becomes Gaussian.
Another case of interest arises when b = 1. In this case the Weibull envelope PDF corresponds
to the Exponential envelope PDF.
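
A direct evaluation of eqs (4.66) and (4.67) is sketched below. As an implementation choice that is not in the report, the ratio $\Gamma(1+c)/\Gamma(1+c-N)$ with $c = mb/2$ is computed as the finite product $c(c-1)\cdots(c-N+1)$, which equals the Gamma ratio but stays finite when $1+c-N$ is a non-positive integer (as happens for b = 2). The function names are hypothetical.

```python
import math


def falling_factorial(c, n):
    """c (c-1) ... (c-n+1), equal to Gamma(1+c) / Gamma(1+c-n)."""
    result = 1.0
    for j in range(n):
        result *= c - j
    return result


def h2n_weibull(w, n, a, b, sigma):
    """Eqs. (4.66)-(4.67) for the Weibull envelope; n is N, so the PDF order is 2N."""
    big_a = a * sigma ** b
    total = 0.0
    for k in range(1, n + 1):
        inner = sum((-1.0) ** (m + n) * math.comb(k, m)
                    * falling_factorial(m * b / 2.0, n)
                    for m in range(1, k + 1))
        c_k = 2.0 ** n * big_a ** k / math.factorial(k) * inner
        total += c_k * w ** (k * b / 2.0 - n)
    return total * math.exp(-big_a * w ** (b / 2.0))
```

For N = 1 the function should reduce to eq (4.62), and for b = 2 with $a\sigma^2 = 1/2$ it should return the Gaussian result $\exp(-w/2)$ at every order.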

4.4.2.3 Generalized Rayleigh Envelope PDF

The next PDF considered is for the Generalized Rayleigh envelope which is given by

$$f_R(r) = \frac{\alpha\,r}{\beta^2\,\Gamma(\frac{2}{\alpha})}\exp\left[-\left(\frac{r}{\beta}\right)^{\alpha}\right] \qquad (0 \le r < \infty) \tag{4.68}$$

where $\alpha$ is the shape parameter and $\beta$ is the scale parameter. Plots of the Generalized Rayleigh
distribution are shown for several values of $\alpha$ in Figures 4.15-4.18.

Proceeding as in the previous example, we find that

$$h_2(w) = A\exp(-B w^{\frac{\alpha}{2}}) \tag{4.69}$$

where

$$A = \frac{\sigma^2\alpha}{\beta^2\,\Gamma(\frac{2}{\alpha})}, \qquad B = \beta^{-\alpha}\sigma^{\alpha}. \tag{4.70}$$

Using eqs (3.25), (4.64) and (4.65), we have

$$h_{2N}(w) = A\sum_{k=1}^{N-1}D_k\,w^{\frac{k\alpha}{2}-N+1}\exp(-B w^{\frac{\alpha}{2}}) \tag{4.71}$$

where

$$D_k = \sum_{m=1}^{k}(-1)^{m+N-1}\,2^{N-1}\,\frac{B^k}{k!}\binom{k}{m}\frac{\Gamma(1 + \frac{m\alpha}{2})}{\Gamma(2 + \frac{m\alpha}{2} - N)}. \tag{4.72}$$

Note that the SIRV PDF is valid only in the range $(0 < \alpha \le 2)$. This is because of the fact
that the monotonicity conditions for the derivatives of $h_2(p)$ are satisfied only for the specified
range of $\alpha$. The Generalized Rayleigh envelope PDF reduces to the Rayleigh envelope PDF when
$\alpha = 2$.

4.4.2.4 Rician Envelope PDF Arising from a Zero Mean Complex Gaussian Process with
Correlated Quadrature Components

There are two possible ways in which the Rician envelope PDF occurs. One possibility arises
through a complex zero mean random process with correlated quadrature components that are
Gaussian. The other is through a non-zero mean complex Gaussian process. The former case is
considered here, since the SIRV PDF can be obtained by differentiation of $h_2(w)$. For this case,
the envelope PDF is given by

$$f_R(r) = \frac{r}{\sqrt{1-\rho^2}}\exp\left[-\frac{r^2}{2(1-\rho^2)}\right]I_0\left(\frac{\rho\,r^2}{2(1-\rho^2)}\right) \qquad (0 \le r < \infty),\ (0 \le \rho < 1) \tag{4.73}$$

where $I_0(x)$ is the modified Bessel function of the first kind of order zero. Plots of the Rician
envelope PDF for several values of $\rho$ are shown in Figures 4.19-4.21. Let

$$A = \frac{\sigma^2}{2(1-\rho^2)}. \tag{4.74}$$

Using eq (4.14), we have

$$h_2(w) = \frac{\sigma^2}{\sqrt{1-\rho^2}}\exp(-Aw)\,I_0(\rho A w). \tag{4.75}$$

From eq (3.25),

$$h_{2N}(w) = (-2)^{N-1}\frac{d^{N-1}h_2(w)}{dw^{N-1}}. \tag{4.76}$$

We then use eq (4.57) and the identities [45]

$$I_n(x) = \frac{1}{2\pi}\int_0^{2\pi}\cos(n\theta)\exp[x\cos(\theta)]\,d\theta, \qquad \cos^k(\theta) = 2^{-k}\sum_{m=0}^{k}\binom{k}{m}\cos[(k-2m)\theta] \tag{4.77}$$

to obtain

$$h_{2N}(w) = \frac{\sigma^{2N}}{(1-\rho^2)^{N-\frac{1}{2}}}\sum_{k=0}^{N-1}\binom{N-1}{k}(-1)^k\left(\frac{\rho}{2}\right)^k\xi_k\exp(-Aw) \tag{4.78}$$

where

$$\xi_k = \sum_{m=0}^{k}\binom{k}{m}I_{k-2m}(\rho A w). \tag{4.79}$$

For $\rho = 0$, note that the Rician envelope PDF corresponds to the Rayleigh envelope PDF.

4.4.3  Examples  Using  the  Marginal  Characteristic  Function 

Successful use of the marginal characteristic function approach requires the knowledge of
various Hankel transforms. For each example, the particular transform used is cited by equation and
page number as it appears in [46]. To illustrate the procedure followed, a detailed derivation is
presented in the first example. However, in the remaining examples, we simply list the univariate
characteristic function of the quadrature components, the corresponding marginal PDF and the
resulting $h_{2N}(w)$. Finally, $h_{2N}(p)$ is obtained by replacing w with p in the expressions for $h_{2N}(w)$.

4.4.3.1 Gaussian Distribution

First, we consider the characteristic function given by

$$\Psi(\omega) = \exp\left(-\frac{\omega^2}{2}\right). \tag{4.80}$$

The corresponding marginal PDF of the quadrature components is

$$f_{Y_k}(y_k) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{y_k^2}{2}\right) \qquad (-\infty < y_k < \infty). \tag{4.81}$$

Equation (4.81) is the PDF of a zero mean unit variance Gaussian random variable. Substitution
of eq (4.80) in eq (4.30) yields

$$h_{2N}(w) = (\sqrt{w})^{1-N}\int_0^\infty \omega^N\exp\left(-\frac{\omega^2}{2}\right)J_{N-1}(\omega\sqrt{w})\,d\omega. \tag{4.82}$$

From [46], eq (10), p29, we have the Hankel transform

$$\int_0^\infty x^{\nu+\frac{1}{2}}\exp(-ax^2)\,J_\nu(xy)\sqrt{xy}\,dx = \frac{y^{\nu+\frac{1}{2}}}{(2a)^{\nu+1}}\exp\left(-\frac{y^2}{4a}\right). \tag{4.83}$$

By making the association that a = 0.5, $\nu = N - 1$, $x = \omega$ and $y = \sqrt{w}$, the above result becomes

$$\int_0^\infty \omega^N\exp\left(-\frac{\omega^2}{2}\right)J_{N-1}(\omega\sqrt{w})\,(\sqrt{w})^{\frac{1}{2}}\,d\omega = (\sqrt{w})^{N-1+\frac{1}{2}}\exp\left(-\frac{w}{2}\right). \tag{4.84}$$

It follows that

$$h_{2N}(w) = \exp\left(-\frac{w}{2}\right). \tag{4.85}$$

From eq (4.8), it is seen that the resulting SIRV PDF is the familiar multivariate Gaussian PDF,
given by

$$f_{\mathbf{Y}}(\mathbf{y}) = (2\pi)^{-N}|\Sigma|^{-1/2}\exp\left(-\frac{p}{2}\right). \tag{4.86}$$

4.4.3.2 K-Distribution

The marginal characteristic function given by

$$\Psi(\omega) = \left(1 + \frac{\omega^2}{b^2}\right)^{-\alpha} \tag{4.87}$$

corresponds to the K-distributed envelope whose PDF is

$$f_R(r) = \frac{2b}{\Gamma(\alpha)}\left(\frac{br}{2}\right)^{\alpha}K_{\alpha-1}(br)\,u(r) \tag{4.88}$$

where $\alpha$ is the shape parameter of the distribution, b denotes its scale parameter, $K_N(t)$ is the
Nth order modified Bessel function of the second kind and $u(r)$ is the unit step function. The
pertinent Hankel transform for this example is found as [46] eq (20), p24:

$$\int_0^\infty x^{\nu+\frac{1}{2}}(x^2 + a^2)^{-\mu-1}J_\nu(xy)\sqrt{xy}\,dx = \frac{a^{\nu-\mu}\,y^{\mu+\frac{1}{2}}\,K_{\nu-\mu}(ay)}{2^{\mu}\,\Gamma(\mu+1)}. \tag{4.89}$$

The resulting $h_{2N}(w)$ is

$$h_{2N}(w) = \frac{b^{2N}}{\Gamma(\alpha)}\,\frac{(b\sqrt{w})^{\alpha-N}}{2^{\alpha-1}}\,K_{N-\alpha}(b\sqrt{w}). \tag{4.90}$$

As a special case, when $\alpha$ is equal to unity, eq (4.87) is the characteristic function of the
Laplace distribution for the quadrature components whose PDF is given by

$$f_{Y_k}(y_k) = \frac{b}{2}\exp(-b|y_k|) \qquad (-\infty < y_k < \infty) \tag{4.91}$$

where $|y_k|$ denotes the absolute value of $y_k$ and b denotes the scale parameter. The corresponding
$h_{2N}(w)$ is given by

$$h_{2N}(w) = b^{2N}(b\sqrt{w})^{1-N}K_{N-1}(b\sqrt{w}). \tag{4.92}$$

Another interesting case of the K-distribution arises when $\alpha = 0.5$. Since $K_{\frac{1}{2}}(t) = \sqrt{\frac{\pi}{2t}}\exp(-t)$,
this corresponds to the exponential distribution for the marginal envelope PDF. Therefore, the
K-distributed envelope PDF with $\alpha = 0.5$ is identical to the Weibull distributed envelope with b = 1.
Although the characteristic PDF of the Weibull SIRV is unknown in general, the characteristic
PDF of the Weibull SIRV for b = 1 is obtained when $\alpha = 0.5$ in eq (4.43). Finally, we point out
that the K-distributed envelope reduces to the Rayleigh envelope PDF when $\alpha$ tends to $\infty$.

4.4.3.3 Student-t Distribution

The characteristic function for the Student-t distribution with scale parameter b and shape
parameter ν is given by

    \Phi(\omega) = \frac{(b\omega)^{\nu} K_{\nu}(b\omega)}{2^{\nu-1}\, \Gamma(\nu)}.    (4.93)

Note the functional similarity with the envelope PDF given by eq (4.88). The Student-t
distribution is referred to as the generalized Cauchy distribution in [47] because the marginal PDF of
the quadrature components is given by

    f_{Y_k}(y_k) = \frac{\Gamma(\nu + 1/2)}{b \sqrt{\pi}\, \Gamma(\nu)} \left(1 + \frac{y_k^2}{b^2}\right)^{-\nu-1/2}    (-\infty < y_k < \infty),  \nu > 0    (4.94)

where Γ(ν) is the Eulero-Gamma function. The relevant Hankel transform, [46] eq (3), p63, is

    \int_0^\infty x^{\mu+\nu+1/2} K_\mu(ax) J_\nu(xy) \sqrt{xy}\, dx
        = \frac{2^{\mu+\nu}\, a^{\mu}\, \Gamma(\mu+\nu+1)\, y^{\nu+1/2}}{(y^2 + a^2)^{\mu+\nu+1}}.    (4.95)

Using eq (4.30), h_{2N}(w) is expressed as

    h_{2N}(w) = \frac{2^{N} b^{2\nu}\, \Gamma(\nu + N)}{\Gamma(\nu)\, (b^2 + w)^{N+\nu}}.    (4.96)

The Cauchy PDF for the quadrature components arises when ν is set equal to 1/2 in eq (4.94)
and is given by

    f_{Y_k}(y_k) = \frac{b}{\pi (b^2 + y_k^2)}    (-\infty < y_k < \infty)    (4.97)

where b is the scale parameter. The corresponding h_{2N}(w) is

    h_{2N}(w) = \frac{2^{N} b\, \Gamma(N + 1/2)}{\sqrt{\pi}\, (b^2 + w)^{N+1/2}}.    (4.98)

Note that the Cauchy PDF does not have finite variance. However, this PDF is useful in modeling
impulsive noise [48]. Finally, we point out that when b = \sqrt{2\nu} and ν tends to ∞ in eq (4.94),
the Student-t distribution reduces to the Gaussian distribution.
4.4.3.4 Rician Envelope PDF Arising from a Non-Zero Mean Complex Gaussian Process

We  consider  the  Rician  envelope  PDF,  arising  from  a  non-zero  mean  complex  Gaussian  process, 
given  by 

Mr)  =  ^exp[— V--^--^]/o(— ).  (4.99) 

Plots  of  the  Rician  envelope  PDF  are  shown  in  Figures  4.20-4.22  for  a  =  1  and  several  values  of 
a.  Note  that  this  PDF'  approaches  the  Rayleigh  PDF  as  a  tends  to  zero.  For  convenience,  we 
assume  that  cr3  =  %E(R?)  =  1.  Using  eq  (4.14),  we  have 

fg  f" 

ha(r2)  =  (4.100) 

where  A  —  exp^'~^r\  Noting  that  [45] 

f™ xexp(-otx2)Il,(px)Ju('yx)dx  -  ^exp(^~^)4(g) 

(4.101) 

Re{ac }  >  0,  Re{i/}  >  —1, 
eq  (4.21)  results  in  the  characteristic  function 

*(“>)  =  exp( - J~)M^  )•  (4.102) 

Recognizing  that  [45] 


/0°°  xx~1exp(— ax2)JM(j8x)J„(‘yx)dx  = 

Barrel  ^  V°°  r(m+a  +  nat+ a+*)  (  g)wP(  m  -  m-  J/  +  1* 

/?e{a}  >  0,  i2e{p  +  v  +  A}  >  —2,  /?  >  0,  7  >  0 


(4.103) 


where  F(.,  .;  .;  .)  is  the  four  parameter  hypergeometric  function,  it  follows  from  eq  (4.30)  that 

v2JV+2  «>  r(mfJV  +  l).-a2 


hiN(w)  =■ 


a‘ 


war 


22N+1T(N)  m!r(m  +  1)  v2q6 


(srs-r^(->».  -”>s  N •  -3-) 


(4.104) 


Since h_{2N}(w) for this example involves an infinite series of hypergeometric functions, its form is
mathematically intractable. Therefore, the corresponding multivariate SIRV PDF does not lend
itself for use in practical applications.

Figure 4.24: Rician Envelope PDF, a = 0.9, a

Table 4.1: Marginal PDF

Marginal PDFs listed: Chi, Weibull, Generalized Rayleigh, Rician, Gaussian, Laplace, Cauchy,
K-distribution, Student-t.

4.4.3.5 Summary

The  results  derived  in  this  section  are  summarized  here.  As  a  point  of  interest,  it  is  mentioned 
that  the  log-normal  envelope  PDF  given  by 

Mr)  =  (4.105) 

and  the  Johnson  (unbounded)  distribution  whose  PDF  is  given  by 


friy)  = 


1  -exp\ 

V^Vf+F  2£J 


(4.106) 


cannot be extended to SIRVs because h_{2N}(w) for each of these distributions fails to satisfy the
monotonicity conditions stated in section 4.3.

Table 4.1 presents a list of marginal PDFs suitable for extension to SIRVs. Table 4.2 tabulates
h_{2N}(p) for those marginal PDFs treated as envelope PDFs while Table 4.3 gives those h_{2N}(p)
obtained from the associated marginal characteristic function.

Plots of eq (4.8) with N = 1 for the various SIRV PDFs are shown in Figures 4.25-4.33. In all
the plots, the covariance matrix used is given by

    \Sigma = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix}.    (4.107)


Observe that each PDF is unimodal. However, the width and height of the peak along with the
behavior of the extreme values (i.e. the tails) differ significantly.

Table 4.2: SIRVs obtained from the marginal envelope PDF

Marginal PDFs tabulated: Chi (ν ≤ 1), Weibull (b ≤ 2), Generalized Rayleigh (α ≤ 2), Rician.

Table 4.3: SIRVs obtained from the marginal characteristic function

Marginal PDF      h_{2N}(p)
Gaussian          \exp(-p/2)
Laplace           b^{2N} (b\sqrt{p})^{1-N} K_{N-1}(b\sqrt{p})
Cauchy            \frac{2^{N} b\, \Gamma(N + 1/2)}{\sqrt{\pi}\, (b^2 + p)^{N+1/2}}
K-distribution    \frac{b^{2N}}{\Gamma(\alpha)} \frac{(b\sqrt{p})^{\alpha-N}}{2^{\alpha-1}} K_{N-\alpha}(b\sqrt{p})
Student-t         \frac{2^{N} b^{2\nu}\, \Gamma(\nu + N)}{\Gamma(\nu)\, (b^2 + p)^{N+\nu}}

4.5  Significance  of  the  Quadratic  form  of  the  SIRV  PDF 

Thus  far,  our  discussion  has  focused  on  techniques  that  can  be  used  to  obtain  the  PDF  of  an 
SIRV  starting  from  either  the  first  order  PDF  or  the  first  order  characteristic  function.  Given 
random  data,  we  are  also  interested  in  the  problem  of  approximating  the  distribution  of  the 
underlying  data.  The  problem  of  multivariate  distribution  identification  is  of  interest  in  radar 
signal  detection.  Since  the  background  clutter  is  not  known  a  priori,  there  is  a  need  to  identify  the 
underlying  clutter  PDF  based  on  measurements  obtained  from  a  given  environment.  Since  the 
radar  processes  N  pulses  at  a  time,  knowledge  of  the  joint  PDF  of  the  N  samples  is  necessary  in 
order  to  obtain  the  optimal  radar  signal  processor  for  the  given  clutter  background.  We  present 
an  important  theorem  here  which  enables  us  to  address  the  distribution  approximation  of  an 
SIRV. 

Theorem 4 The PDF of the quadratic form appearing in eq (3.15) is given by

    f_P(p) = \frac{p^{N/2-1}}{2^{N/2}\, \Gamma(N/2)} h_N(p)    (0 \le p < \infty).    (4.108)

Proof: First, we consider a spherically symmetric random vector (SSRV) X = [X_1, X_2, ..., X_N]^T.
Because an SSRV is a special case of the SIRV, the representation theorem can be used to express
X as

    X = Z S    (4.109)

where Z is a Gaussian random vector having zero mean and identity covariance matrix and S is
a non-negative random variable with PDF f_S(s). Consider the random variable

    P' = X^T X.    (4.110)

Using eq (4.109) in eq (4.110) gives

    P' = Z^T Z S^2.    (4.111)

Since Z^T Z = \sum_{k=1}^{N} Z_k^2 is the sum of the squares of independent identically distributed Gaussian
random variables having zero mean and unit variance, the PDF of V = Z^T Z is a Chi square
distribution with N degrees of freedom. Consequently,

    f_V(v) = \frac{v^{N/2-1} \exp(-v/2)}{2^{N/2}\, \Gamma(N/2)},    v \ge 0.    (4.112)

Figure 4.25: Gaussian distribution, zero mean, unit variance
Figure 4.26: Laplace Distribution, b
Figure 4.27: Cauchy Distribution, b = 1
Figure 4.30: Chi-distribution, b = 1, ν = 1
Weibull distribution, a

Noting that P' = V S^2, it follows that

    f_{P'|S}(p'|s) = \frac{(p')^{N/2-1}}{2^{N/2}\, \Gamma(N/2)} s^{-N} \exp\left(-\frac{p'}{2s^2}\right).    (4.113)

From the theorem of total probability, we have

    f_{P'}(p') = \int_0^\infty \frac{(p')^{N/2-1}}{2^{N/2}\, \Gamma(N/2)} s^{-N} \exp\left(-\frac{p'}{2s^2}\right) f_S(s)\, ds.    (4.114)

Recall from Theorem 2 that

    h_N(p) = \int_0^\infty s^{-N} \exp\left(-\frac{p}{2s^2}\right) f_S(s)\, ds.    (4.115)

Consequently, the PDF of P' is expressed as

    f_{P'}(p') = \frac{(p')^{N/2-1}}{2^{N/2}\, \Gamma(N/2)} h_N(p').    (4.116)

Recall that an SIRV Y = [Y_1, Y_2, ..., Y_N]^T having a mean vector b and covariance matrix Σ is
related to the SSRV X by the linear transformation

    Y = A X + b    (4.117)

where Σ = A A^T. Observe that

    P = (Y - b)^T \Sigma^{-1} (Y - b)
      = (A X)^T (A A^T)^{-1} A X    (4.118)
      = X^T X.
Since P = P', the PDF of the quadratic form P which is associated with Y is

    f_P(p) = \frac{p^{N/2-1}}{2^{N/2}\, \Gamma(N/2)} h_N(p).    (4.119)

This establishes the theorem. Thus, an SIRV is uniquely characterized by the quadratic form
appearing in its PDF. Knowledge of the quadratic form PDF is sufficient to identify the SIRV PDF.
This is an important result since it allows us to reduce the multivariate distribution
identification problem to the equivalent problem of univariate distribution identification of the quadratic
form. We point out that the invariance of the distribution of the quadratic form, regardless of
whether we are dealing with an SIRV or an SSRV, arises from the fact that the random vector
is distributed over an N dimensional hypersphere of radius R. The radius of the hypersphere
remains unchanged regardless of whether we consider an SIRV or an SSRV. Only the azimuthal
angles and radial angle change depending on whether the random vector is an SSRV or an SIRV.
In the context of the radar problem, we are dealing with N complex samples or 2N quadrature
components. The results presented in this section are applicable when N is replaced by 2N.
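The invariance used in the proof, P = (Y - b)^T Σ^{-1} (Y - b) = X^T X, can be checked numerically. The following is a minimal sketch for N = 2, not part of the original derivation. The function name, the uniform draw standing in for S, and the particular A and b are illustrative assumptions; A is chosen so that A A^T equals the matrix of eq (4.107).

```python
import math
import random


def quadratic_forms(seed=0):
    """Return (P, P') for one sample: P from the colored vector Y, P' = X^T X."""
    rng = random.Random(seed)

    # White SIRV sample X = Z * S for N = 2. The uniform draw for S is only a
    # stand-in for some non-negative modulating variable (an assumption).
    s = rng.uniform(0.5, 2.0)
    x = [rng.gauss(0.0, 1.0) * s, rng.gauss(0.0, 1.0) * s]

    # Illustrative A and b; A A^T reproduces [[1, 0.5], [0.5, 1]].
    a = [[1.0, 0.0], [0.5, math.sqrt(0.75)]]
    b = [3.0, -1.0]
    y = [a[i][0] * x[0] + a[i][1] * x[1] + b[i] for i in range(2)]

    # Sigma = A A^T and its inverse (closed form for a 2x2 matrix).
    s11 = a[0][0] ** 2 + a[0][1] ** 2
    s12 = a[0][0] * a[1][0] + a[0][1] * a[1][1]
    s22 = a[1][0] ** 2 + a[1][1] ** 2
    det = s11 * s22 - s12 * s12
    inv = [[s22 / det, -s12 / det], [-s12 / det, s11 / det]]

    # Quadratic form of the colored vector and squared norm of the white one.
    d = [y[0] - b[0], y[1] - b[1]]
    p = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    p_white = x[0] ** 2 + x[1] ** 2
    return p, p_white
```

Calling `quadratic_forms()` with different seeds should return two values that agree to within floating-point rounding, whatever A and b are used, provided A A^T is nonsingular.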

4.6  Conclusion 

In  this  chapter  we  have  pointed  out  a  method  to  obtain  the  PDF  of  correlated  non-Gaussian 
random vectors arising in the problem of radar clutter modeling. The theory of SIRPs has been
used  to  develop  the  multivariate  PDFs.  Various  techniques  have  been  presented  to  obtain  SIRV 
PDFs.  Several  examples  are  provided  to  illustrate  these  techniques.  Finally,  we  have  obtained 
the  PDF  of  the  quadratic  form  of  a  SIRV  and  we  have  shown  that  this  PDF  remains  unchanged 
regardless  of  whether  we  are  dealing  with  an  SSRV  or  an  SIRV.  We  have  also  established  that  the 
quadratic  form  contains  all  the  information  that  is  required  in  order  to  identify  the  SIRV  PDF.  As 
a  consequence  of  this  result,  the  problem  of  an  SIRV  (multivariate)  distribution  identification  has 
been  reduced  to  the  equivalent  identification  of  the  univariate  distribution  of  the  non-negative 
quadratic  form. 
Chapter 5

Computer  Generation  of  Simulated 
Radar  Clutter  Characterized  as  SIRPs 


5.1  Introduction 

This  investigation  is  motivated  by  a  desire  to  simulate  correlated  non-Gaussian  radar  clutter. 
Various investigators have reported experimental results where non-Gaussian marginal probability
density functions (PDF) have been used to model the clutter. Usually, radars process N
samples  at  a  time.  Statistical  characterization  of  the  clutter  requires  the  specification  of  the 
joint  PDF  of  the  N  samples.  In  addition,  the  clutter  may  be  highly  correlated.  Hence,  the  joint 
PDF  must  take  into  account  the  correlation  between  samples.  Statistical  characterization  of  the 
clutter  is  necessary  if  an  optimal  radar  signal  processor  is  to  be  obtained.  For  use  of  the  well 
known  likelihood  ratio  test,  it  is  necessary  to  have  closed  form  expressions  for  the  joint  PDF  of 
the N clutter samples in order to obtain the optimal radar signal processor. In most cases, it is
difficult to evaluate the performance of the optimal radar signal processor analytically when the
clutter samples are correlated and non-Gaussian. Computer simulation may then be necessary.
Therefore,  there  is  a  need  to  develop  efficient  procedures  that  facilitate  computer  simulation 
of  the  clutter.  A  library  of  multivariate  non-Gaussian  PDFs  has  been  developed  in  Chapter 
4,  using  the  theory  of  Spherically  Invariant  Random  Processes  (SIRP)  and  Spherically 
Invariant  Random  Vectors  (SIRV).  In  view  of  the  large  number  of  parameters  that  are  free 
to  be  specified,  the  library  of  multivariate  non-Gaussian  PDFs  can  be  used  to  approximate  many 
different  radar  clutter  scenarios.  In  this  chapter  we  concern  ourselves  with  the  development  of 
computer  simulation  procedures  for  the  library  of  non-Gaussian  PDFs  obtained  in  Chapter  4 
so that the performance of any radar signal processor can be evaluated for a variety of different
clutter scenarios. Another issue addressed in this chapter is performance assessment of the
simulation procedures. It has been pointed out in Chapter 4 that the quadratic form appearing
in the PDF of the SIRV contains all the information necessary to identify the PDF of the
underlying SIRV. We make use of this result in order to assess the performance of the simulation
procedures. Some interesting simulation techniques have been proposed for SIRVs in [31] and
[33]. The technique suggested in [31] makes use of Meijer's G-functions. These functions are
generalizations of hypergeometric functions which do not lend themselves to the development of
simple and elegant simulation procedures. The technique suggested in [33] requires
transformations from rectangular to spherical coordinates and then back again. In addition, this simulation
procedure involves the use of the inverse distribution function approach for a rather complicated
distribution function. The approach developed in this chapter is simpler to implement than those
proposed in [31] and [33]. In addition, a new approach is proposed for assessing the effectiveness
of the simulation procedure.

In  Section  5.2,  we  review  some  definitions  and  background  information  pertaining  to  the  theory 
of spherically invariant random processes. Section 5.3 presents two canonical simulation
procedures for generating SIRVs. Performance assessment of the simulation procedures is discussed
in Section 5.4. Finally, conclusions are presented in Section 5.5.

5.2  Preliminaries 

We  begin  by  restating  the  definitions  for  spherically  invariant  random  vector  and  spherically 
invariant  random  processes.  A  spherically  invariant  random  vector  (SIRV)  is  a  random  vector 
(real  or  complex)  whose  PDF  is  uniquely  determined  by  the  specification  of  a  mean  vector,  a 
covariance  matrix  and  a  characteristic  first  order  PDF.  Equivalently,  the  PDF  of  an  SIRV  can 
also  be  referred  to  as  an  elliptically  contoured  distribution.  A  spherically  invariant  random 
process  (SIRP)  is  a  random  process  (real  or  complex)  such  that  every  random  vector  obtained 
by  sampling  this  process  is  an  SIRV.  The  work  of  Yao  [28]  gave  rise  to  a  representation  theorem 
which  can  be  stated  as  follows  (see  Theorem  1): 

If  a  random  vector  is  a  SIRV,  then  there  exists  a  non-negative  random  variable  S  such  that 
the  PDF  of  the  random  vector  conditioned  on  S  is  a  multivariate  Gaussian  PDF. 

We consider the product given by X = ZS, where X = [X_1 ... X_N]^T denotes the SIRV,
Z = [Z_1 ... Z_N]^T is a Gaussian random vector with zero mean and covariance matrix M and
S is a non-negative random variable with PDF f_S(s). Since it is desirable to independently
control the correlation properties and the non-Gaussian envelope PDF, Z and S are assumed to
be statistically independent. The PDF of X conditioned on S is (see eq (3.14))

    f_{X|S}(x|s) = (2\pi)^{-N/2} |M|^{-1/2} s^{-N} \exp\left(-\frac{p}{2s^2}\right)    (5.1)

where p is a non-negative quadratic form given by p = x^T M^{-1} x and |M| denotes the determinant
of the covariance matrix M. The PDF of X is given by (see eqs (3.15) and (3.16))

    f_X(x) = (2\pi)^{-N/2} |M|^{-1/2} h_N(p)    (5.2)

where

    h_N(p) = \int_0^\infty s^{-N} \exp\left(-\frac{p}{2s^2}\right) f_S(s)\, ds.    (5.3)

The PDF of the random variable S is called the characteristic PDF of the SIRV. Therefore, it is
apparent that the PDF of an SIRV is completely determined by the specification of a mean vector,
a covariance matrix and a characteristic first order PDF. In addition, the PDF of the SIRV is a
function of a non-negative quadratic form. However, unlike the Gaussian case, dependence on
the quadratic form is more complicated than the simple exponential. Therefore, an SIRP can
be regarded as a generalization of the familiar Gaussian random process. We point out that the
covariance matrix of the SIRV is given by Σ = M E(S^2) where E(S^2) is the mean square value of
the random variable S. It is seen that the covariance matrix of the SIRV normalized by the mean
square value of S is the covariance matrix of the Gaussian random vector. Note that it is possible
to set the covariance matrix of the SIRV equal to that of the Gaussian random vector by requiring
that E(S^2) be equal to unity. The desired non-Gaussian PDF can be obtained by choosing f_S(s)
appropriately. Thus, it is seen that the SIRV formulation for radar clutter modeling affords
independent control over the non-Gaussian PDF of the clutter and its correlation properties.
Several techniques are available in Chapter 4 for obtaining h_N(p). Note that the Gaussian
random vector is a special case of an SIRV and is obtained when f_S(s) = δ(s - 1) where δ(t)
is the unit impulse function. An interesting interpretation of the representation theorem is that
every SIRV is the modulation of a Gaussian random vector by a non-negative random variable.
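As a quick check of the Gaussian special case, the impulse can be substituted directly into the integral defining h_N(p). This worked step is added for illustration and assumes eq (5.3) has the form h_N(p) = \int_0^\infty s^{-N} \exp(-p/(2s^2)) f_S(s) ds:

```latex
h_N(p) = \int_0^\infty s^{-N} \exp\!\left(-\frac{p}{2s^2}\right) \delta(s-1)\, ds
       = \exp\!\left(-\frac{p}{2}\right),
\qquad
f_X(x) = (2\pi)^{-N/2}\, |M|^{-1/2} \exp\!\left(-\frac{p}{2}\right).
```

The sifting property of the impulse evaluates the integrand at s = 1, so eq (5.2) collapses to the usual multivariate Gaussian PDF with covariance matrix M.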

Many  of  the  attractive  properties  of  Gaussian  random  vectors  also  apply  to  SIRVs.  The  most 
relevant  property  of  SIRVs  for  the  purpose  of  computer  simulation  is  the  closure  property  under 
linear transformation [28] stated below (see Theorem 2):

If X is an SIRV with characteristic PDF f_S(s), then

    Y = A X + b    (5.4)

is also an SIRV with the same characteristic PDF. It is assumed that A A^T is a nonsingular
matrix and b is a known vector having the same dimension as X.

Theorem 2 provides us with a powerful technique for simulating SIRVs. A white SIRV is
defined as one that has a diagonal covariance matrix. In other words, the components of the
white SIRV are uncorrelated but not necessarily independent. We can start with a zero mean
white SIRV X having identity covariance matrix and perform the linear transformation given by
eq (5.4) to obtain an SIRV Y having a non-zero mean and desired covariance matrix Σ. The
matrix A and the vector b are given by

    A = E D^{1/2}
    b = \mu_y    (5.5)

where E is the matrix of normalized eigenvectors of the covariance matrix Σ, D is the diagonal
matrix of eigenvalues of Σ and μ_y is the desired non-zero mean vector.
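The construction A = E D^{1/2} can be illustrated for a 2 x 2 covariance matrix, where the eigen-decomposition has a closed form. This is a sketch under stated assumptions, not the report's implementation: the function names are hypothetical, and the closed-form rotation angle is a standard result for symmetric 2 x 2 matrices rather than something taken from the text.

```python
import math


def sqrt_factor_2x2(s11, s12, s22):
    """Return A = E D^(1/2) for the symmetric matrix [[s11, s12], [s12, s22]]."""
    mean = 0.5 * (s11 + s22)
    radius = math.hypot(0.5 * (s11 - s22), s12)
    lam = (mean + radius, mean - radius)           # eigenvalues (diagonal of D)
    theta = 0.5 * math.atan2(2.0 * s12, s11 - s22)
    # Columns of E are the normalized eigenvectors, in the same order as lam.
    e = [[math.cos(theta), -math.sin(theta)],
         [math.sin(theta), math.cos(theta)]]
    return [[e[i][j] * math.sqrt(lam[j]) for j in range(2)] for i in range(2)]


def color(a, mean, x):
    """Apply Y = A X + b, with b equal to the desired mean vector."""
    return [a[i][0] * x[0] + a[i][1] * x[1] + mean[i] for i in range(2)]
```

For the matrix [[1, 0.5], [0.5, 1]], `sqrt_factor_2x2(1.0, 0.5, 1.0)` returns a factor whose product A A^T recovers the matrix to rounding error. Any other factor with A A^T = Σ, such as a Cholesky factor, would give the same covariance.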

In  many  instances  it  is  not  possible  to  obtain  fs(s)  for  an  SIRV  in  closed  form,  even  though 
its  existence  is  guaranteed.  In  such  cases,  an  alternate  approach  must  be  used  in  order  to 
characterize  the  SIRV.  The  following  theorem  can  be  used  to  completely  characterize  a  white 
SIRV  having  zero  mean  and  identity  covariance  matrix  (see  Theorem  3): 

A random vector X = [X_1 ... X_N]^T is a zero mean white SIRV having identity covariance
matrix if and only if there exist random variables R ∈ (0, ∞), Θ ∈ (0, 2π) and Φ_k ∈ (0, π),
(k = 1, ..., N - 2) such that when the components of X are expressed in the generalized spherical
coordinates

    X_1 = R \cos(\Phi_1)
    X_k = R \cos(\Phi_k) \prod_{i=1}^{k-1} \sin(\Phi_i)    (1 < k \le N-2)
    X_{N-1} = R \cos(\Theta) \prod_{i=1}^{N-2} \sin(\Phi_i)    (5.6)
    X_N = R \sin(\Theta) \prod_{i=1}^{N-2} \sin(\Phi_i)

then the random variables R, Θ and Φ_k are mutually and statistically independent having PDFs
of the form

    f_R(r) = \frac{r^{N-1}}{2^{N/2-1}\, \Gamma(N/2)} h_N(r^2)\, u(r)
    f_{\Phi_k}(\phi_k) = \frac{\Gamma((N-k+1)/2)}{\sqrt{\pi}\, \Gamma((N-k)/2)} \sin^{N-1-k}(\phi_k) [u(\phi_k) - u(\phi_k - \pi)]    (5.7)
    f_\Theta(\theta) = (2\pi)^{-1} [u(\theta) - u(\theta - 2\pi)]

where Γ(ν) is the Eulero Gamma function and u(t) is the unit step function.

As a consequence of Theorem 3, any SIRV with zero mean and identity covariance matrix
can be represented in generalized spherical coordinates which are mutually and statistically
independent regardless of the SIRV considered. Also, note that the PDFs of Θ and Φ_k (k =
1, ..., N - 2) are functionally independent of the white SIRV considered. Only the PDF of R
changes from one white SIRV to another. Note that R^2 = \sum_{k=1}^{N} X_k^2 = X^T X. Hence R is the
norm of the SIRV.
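The generalized spherical coordinates can be written as a short routine, which also makes it easy to confirm that R is the norm of X. This is an illustrative sketch: the function name and argument order are assumptions, and the mapping follows the standard form X_1 = R cos(Φ_1), X_k = R cos(Φ_k) Π sin(Φ_i), with Θ entering the last two components.

```python
import math


def spherical_to_cartesian(r, phis, theta):
    """Map (R, Phi_1..Phi_{N-2}, Theta) to the N components X_1..X_N."""
    x = []
    sin_prod = 1.0                       # running product sin(Phi_1)...sin(Phi_{k-1})
    for phi in phis:
        x.append(r * math.cos(phi) * sin_prod)
        sin_prod *= math.sin(phi)
    x.append(r * math.cos(theta) * sin_prod)   # X_{N-1}
    x.append(r * math.sin(theta) * sin_prod)   # X_N
    return x
```

With three angles Φ_k and one angle Θ the routine returns a vector of length N = 5 whose Euclidean norm equals the supplied r, for any choice of angles.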

Another  important  feature  of  the  SIRV  is  that  the  quadratic  form  appearing  in  its  PDF  contains 
all  the  information  necessary  to  identify  the  PDF.  It  follows  that  knowledge  of  the  PDF  of  the 
quadratic  form  of  the  SIRV  is  sufficient  to  identify  the  PDF  of  the  corresponding  SIRV  [34]  (see 
Theorem  4): 

The  PDF  of  the  quadratic  form  appearing  in  eq  (5.2)  is  given  by 

    f_P(p) = \frac{1}{2^{N/2}\, \Gamma(N/2)} p^{N/2-1} h_N(p)    (0 \le p < \infty)    (5.8)

and  remains  unchanged  regardless  of  whether  or  not  the  SIRV  is  white. 


The  theorems  reviewed  in  this  section  will  be  made  use  of  in  the  proposed  simulation  approach, 
discussed  in  Section  5.3,  and  in  assessing  the  performance  of  the  simulation  procedure,  discussed 
in  Section  5.4. 

In the context of the problem of radar clutter modeling and simulation, the bandpass process
Y(t) = Re[\tilde{Y}(t) \exp(j\omega_0 t)] can be expressed in terms of the equivalent complex, wide sense
stationary random process \tilde{Y}(t). More precisely, we obtain N complex samples by sampling
the complex random process \tilde{Y}(t) = Y_c(t) + jY_s(t), where the subscripts c and s denote the
in-phase and out-of-phase quadrature components. This is equivalent to working with a real vector
of 2N quadrature components, which is the approach taken in this chapter. Therefore, the results
presented in this section are applied to the problem of radar clutter modeling with N replaced
by 2N. For ease of reference, the library of non-Gaussian SIRV PDFs obtained in Chapter 4 is
repeated here. The functions h_{2N}(p) for those SIRVs for which the characteristic PDF is known are
listed in Table 5.1. The corresponding characteristic PDFs are listed in Table 5.3. Table 5.2 lists
h_{2N}(p) for those SIRVs whose characteristic PDF is unknown.

5.3  Two  Canonical  Simulation  Procedures  for  Generating  SIRVs 

In this section, we concern ourselves with two simulation procedures for generating the SIRVs
listed in Table 5.1 and Table 5.2. The first simulation procedure to be discussed is applicable
when the characteristic PDF, f_S(s), is known. For each of the PDFs listed in Table 5.1, the
characteristic PDF f_S(s) is tabulated in Table 5.3, where E(S^2) = 1. Since the representation
theorem results in the covariance matrix of the SIRV being given by Σ = M E(S^2), the choice
of E(S^2) = 1 makes Σ identical to M, the covariance matrix of the Gaussian random vector
Z. However, as listed in Table 5.4, the PDFs commonly encountered in statistical tables do not
have unit mean square value. In order to obtain the random variable S, having unit mean square
value and the corresponding PDF f_S(s), we generate the random variable V having PDF f_V(v)
and mean square value E(V^2) = a^2, and perform the linear transformation S = V/a to obtain the
desired S. In Table 5.1 and Table 5.4, the scale parameter b, as well as the shape parameter ν,
are identical in both cases and u(v) denotes the unit step function. The simulation procedure
for these SIRV PDFs is fairly simple and is stated below:

5.3.1  Simulation  Procedure  for  SIRVs  with  Known  Characteristic  PDF 

(1) Generate a white zero mean Gaussian random vector Z, having
identity covariance matrix.

(2) Then generate a random variable V from the PDF f_V(v). Denote
the mean square value of V by a^2.

(3) Normalize the random variable V by a to obtain the modulating
random variable S. In other words, generate S = V/a.

(4) Generate the product given by X = Z S. At this step, we have a
white SIRV having zero mean and identity covariance matrix.

(5) Finally, perform the linear transformation given by eq (5.5) to
obtain the SIRV Y with desired mean and covariance matrix.

Fig  5.1  shows  the  simulation  procedure  presented  above. 
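The five steps above can be sketched in a few lines. The example below is illustrative rather than the report's IMSL-based implementation. It assumes a Rayleigh f_V(v), which the text associates with the Laplace SIRV, takes unit Rayleigh scale (the scale cancels in S = V/a), and uses a hypothetical lower-triangular A with A A^T = [[1, 0.5], [0.5, 1]] in place of the eigenvector form. All function names are assumptions.

```python
import math
import random


def simulate_sirv(a_mat, mean, sample_v, rms_v, rng):
    """One draw of Y following steps (1)-(5)."""
    n = len(mean)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]    # (1) white Gaussian vector Z
    v = sample_v(rng)                               # (2) V drawn from f_V(v)
    s = v / rms_v                                   # (3) S = V / a, so E(S^2) = 1
    x = [zk * s for zk in z]                        # (4) white SIRV X = Z S
    return [sum(a_mat[i][j] * x[j] for j in range(n)) + mean[i]
            for i in range(n)]                      # (5) Y = A X + b


def rayleigh_v(rng):
    """Unit-scale Rayleigh variate via the inverse distribution function."""
    return math.sqrt(-2.0 * math.log(1.0 - rng.random()))


def sample_covariance(n_samples=20000, seed=1):
    """Estimate the covariance entries of Y from simulated zero mean data."""
    rng = random.Random(seed)
    a_mat = [[1.0, 0.0], [0.5, math.sqrt(0.75)]]    # A A^T = [[1, 0.5], [0.5, 1]]
    mean = [0.0, 0.0]
    rms_v = math.sqrt(2.0)                          # E(V^2) = 2 for unit-scale Rayleigh
    ys = [simulate_sirv(a_mat, mean, rayleigh_v, rms_v, rng)
          for _ in range(n_samples)]
    c11 = sum(y[0] * y[0] for y in ys) / n_samples
    c12 = sum(y[0] * y[1] for y in ys) / n_samples
    c22 = sum(y[1] * y[1] for y in ys) / n_samples
    return c11, c12, c22
```

Because E(S^2) = 1, the estimated covariance entries should be close to 1, 0.5 and 1 even though the samples are non-Gaussian. Other SIRVs are obtained by passing a different `sample_v` and the matching root mean square value.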

The subroutine RNNOR in IMSL was used for generating the Gaussian random vector Z.
Interestingly enough, the PDFs listed in Table 5.4 can be related to the PDF of the Gamma
distribution as discussed below. The PDF f_V(v) for the K-distributed SIRV is a Chi PDF. We first
address the random variable generation for the Chi PDF and then provide the transformations
for obtaining the random variables for the other PDFs listed in Table 5.4.

Consider the standard Gamma distribution given by

    f_T(t) = \frac{t^{\alpha-1} \exp(-t)}{\Gamma(\alpha)},    t \ge 0    (5.9)

where α denotes the shape parameter and Γ(α) is the Eulero-Gamma function. The random
variable T is readily generated by using the IMSL subroutine RNGAM. The procedure for
generating the Chi distributed random variable V needed for the K-distributed SIRV is summarized
below.

1. Generate the random variable T from the standard Gamma distribution
of eq (5.9) by using the IMSL subroutine RNGAM.

2. Perform the transformation V = \sqrt{2T}/b.
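A sketch of this two-step generation follows, with Python's `random.gammavariate` standing in for RNGAM. The transformation V = sqrt(2T)/b is an assumption here: it is the change of variable that turns a standard Gamma variate with shape α into a Chi variate with PDF proportional to (bv)^{2α-1} exp(-b^2 v^2 / 2), and under it E(V^2) = 2α/b^2. The function names are hypothetical.

```python
import math
import random


def chi_variate(alpha, b, rng):
    """Draw V for the K-distributed SIRV, assuming V = sqrt(2 T) / b."""
    t = rng.gammavariate(alpha, 1.0)   # standard Gamma: shape alpha, unit scale
    return math.sqrt(2.0 * t) / b


def chi_mean_square(alpha, b, n_samples=20000, seed=2):
    """Monte Carlo estimate of E(V^2), to be compared with 2 * alpha / b**2."""
    rng = random.Random(seed)
    return sum(chi_variate(alpha, b, rng) ** 2 for _ in range(n_samples)) / n_samples
```

The estimate from `chi_mean_square(1.5, 2.0)` should lie near 0.75, which gives the normalizing constant a = sqrt(2α)/b used to form S = V/a.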

The PDF f_V(v) for the Laplace SIRV is a Rayleigh PDF and is obtained from f_V(v) of the
K-distributed SIRV by letting α = 1. The random variable V for the PDF f_V(v) listed in
Table 5.4 for the Student-t SIRV is obtained from the standard Gamma PDF of eq (5.9) by the
transformation V = b/\sqrt{2T} and letting α = ν. Finally, the PDF f_V(v) for the Cauchy SIRV is


Table 5.1: h_{2N}(p) for SIRVs with Known Characteristic PDF

Marginal PDF      h_{2N}(p)
Laplace           b^{2N} (b\sqrt{p})^{1-N} K_{N-1}(b\sqrt{p})
Cauchy            \frac{2^{N} b\, \Gamma(N + 1/2)}{\sqrt{\pi}\, (b^2 + p)^{N+1/2}}
K-distribution    \frac{b^{2N}}{\Gamma(\alpha)} \frac{(b\sqrt{p})^{\alpha-N}}{2^{\alpha-1}} K_{N-\alpha}(b\sqrt{p})
Student-t         \frac{2^{N} b^{2\nu}\, \Gamma(\nu + N)}{\Gamma(\nu)\, (b^2 + p)^{N+\nu}}

Table 5.2: h_{2N}(p) for SIRVs with Unknown Characteristic PDFs

Marginal PDFs tabulated: Chi (ν ≤ 1), Weibull (b ≤ 2), Generalized Rayleigh (α ≤ 2), Rician.

obtained from f_V(v) of the Student-t SIRV by letting ν = 1. The procedure for generating the
random variable V needed for the Student-t SIRV is summarized below.

1. Generate the random variable T from the standard Gamma distribution
of eq (5.9) by using the IMSL subroutine RNGAM.

2. Perform the transformation V = b/\sqrt{2T}.
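The Student-t case can be sketched the same way, again with `random.gammavariate` standing in for RNGAM. The transformation V = b/sqrt(2T), with T a standard Gamma variate of shape ν, is an assumption: it yields a PDF proportional to v^{-(2ν+1)} exp(-b^2 / (2 v^2)), for which E(V^2) = b^2 / (2(ν - 1)) when ν > 1. The function names are hypothetical.

```python
import math
import random


def student_t_variate(nu, b, rng):
    """Draw V for the Student-t SIRV, assuming V = b / sqrt(2 T)."""
    t = rng.gammavariate(nu, 1.0)      # standard Gamma: shape nu, unit scale
    return b / math.sqrt(2.0 * t)


def student_t_mean_square(nu, b, n_samples=20000, seed=3):
    """Monte Carlo estimate of E(V^2); finite only for nu > 1."""
    rng = random.Random(seed)
    return sum(student_t_variate(nu, b, rng) ** 2 for _ in range(n_samples)) / n_samples
```

For ν = 4 and b = 2 the estimate should be near 2/3. For ν ≤ 1 the mean square value is not finite, so the normalization S = V/a cannot be applied in that range.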

5.3.2  Simulation  Scheme  for  SIRVs  with  Unknown  Characteristic  PDF 

Table 5.3: Characteristic PDF for SIRVs listed in Table 4.1 [E(S^2) = 1]

Marginal PDFs tabulated: Laplace, Cauchy, K-distribution, Student-t.

Table 5.4: Related PDF f_V(v)

Marginal PDF      f_V(v)
Laplace           K-distribution entry with α = 1 (Rayleigh)
Cauchy            Student-t entry with ν = 1
K-distribution    \frac{2b}{\Gamma(\alpha)\, 2^{\alpha}} (bv)^{2\alpha-1} \exp\left(-\frac{b^2 v^2}{2}\right) u(v)
Student-t         \frac{2b}{\Gamma(\nu)\, 2^{\nu}} b^{2\nu-1} v^{-(2\nu+1)} \exp\left(-\frac{b^2}{2v^2}\right) u(v)

Figure 5.1: Simulation Scheme for SIRVs with Known Characteristic PDF

We now concern ourselves with the second simulation procedure which is applicable when
the characteristic PDF is unknown, as is the case for SIRVs listed in Table 4.2. The alternate
approach makes use of Theorem 3 and the representation theorem. As pointed out previously,
the PDFs of Θ and Φ_k (k = 1, 2, ..., N - 2) are independent of the white SIRV being considered.
Only the PDF of R changes from one white SIRV to another. As a result, the second simulation
procedure requires the capability to generate the random variable R whose PDF is given by
eq (5.7). Since the Gaussian random vector belongs to the family of SIRVs, a zero mean white
Gaussian random vector Z with identity covariance matrix admits a representation of the form
of eq (5.6). Let R_G denote the norm of the white Gaussian random vector. The simulation
procedure is stated below:

(1) Generate a white, zero mean Gaussian random vector Z having
identity covariance matrix.

(2) Compute the norm R_G = ||Z|| = \sqrt{Z^T Z} of the white Gaussian
random vector.

(3) Generate the norm R = ||X|| = \sqrt{X^T X} of the white SIRV from
the PDF of R given by eq (5.7).

(4) Generate the white SIRV X by taking the product X = Z R/R_G.

(5) Finally, perform the linear transformation given by eq (5.5) to
obtain the SIRV Y with desired mean and covariance matrix.

The  simulation  procedure  is  shown  schematically  in  Fig  5.2. 
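Steps (1), (2) and (4) amount to rescaling a white Gaussian vector so that its norm equals the separately generated R. A minimal sketch follows, assuming the product in step (4) is X = Z R / R_G; the function name is hypothetical, and the value of R is passed in rather than drawn from eq (5.7).

```python
import math
import random


def white_sirv_from_norm(n, r, rng):
    """Scale a white Gaussian vector Z of length n so that ||X|| equals r."""
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]       # step (1)
    r_g = math.sqrt(sum(zk * zk for zk in z))         # step (2): R_G = ||Z||
    return [zk * r / r_g for zk in z]                 # step (4): X = Z R / R_G
```

The direction Z / R_G is uniformly distributed on the unit hypersphere, which is why the angles Θ and Φ_k never need to be generated explicitly. The returned vector has norm r by construction.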

Figure 5.2: Simulation Scheme for SIRVs with Unknown Characteristic PDF

Note that this simulation procedure avoids the explicit generation of the variables Θ and
Φ_k (k = 1, ..., N - 2). The generation procedure for a white Gaussian random vector is
well known. Therefore, we need to concern ourselves only with the development of a suitable
generation scheme for the norm R of the white SIRV X. Generation of the norm R is not trivial.
This is due to the fact that the PDF of R is usually not in a simple functional form. Consequently,
it may not be possible to conveniently evaluate analytically the distribution function and its
inverse. As a result, generation methods based on the inverse distribution function do not offer a
practical solution to this problem. Therefore, in this chapter we generate R by making use of the
approach called the 'Rejection Method'. The rejection method can be used to generate random
variables whose cumulative distribution functions are not known, but whose PDFs are known
explicitly [49]. The rejection procedure assumes knowledge of the maximum value of the PDF
of R for a given SIRV PDF and a finite estimate of the range of the PDF of R so that the area
under the PDF curve is close to unity. These quantities are denoted by c and b, respectively. We
discuss the rejection procedure in detail in Appendix B. The rejection method is summarized
below:

(1) Generate a uniform random variate $U_1$ on the interval (0, b).

(2) Generate another uniform variate $U_2$ on the interval (0, c).

(3) If $U_2 \le f_R(U_1)$, then $R = U_1$. Otherwise, reject $U_1$ and return to
step 1.
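
A minimal Python sketch of the three rejection steps follows. The function name `rejection_sample` is hypothetical, and the Rayleigh density used in the example is chosen only for illustration (it is the PDF of the norm of a two-dimensional white Gaussian vector); it is not the PDF of eq (5.7). Its maximum is exp(-1/2) at r = 1, and the range is truncated at b = 6, beyond which the remaining area is negligible.

```python
import math
import random


def rejection_sample(pdf, b, c, rng=random):
    """Draw one variate with density `pdf` on (0, b), where c >= max(pdf)."""
    while True:
        u1 = rng.uniform(0.0, b)   # step (1): candidate value on (0, b)
        u2 = rng.uniform(0.0, c)   # step (2): uniform height on (0, c)
        if u2 <= pdf(u1):          # step (3): accept if under the PDF curve
            return u1


def rayleigh_pdf(r):
    """Illustrative target density: f(r) = r exp(-r^2 / 2), r >= 0."""
    return r * math.exp(-0.5 * r * r)


random.seed(1)
samples = [rejection_sample(rayleigh_pdf, b=6.0, c=math.exp(-0.5))
           for _ in range(10000)]
sample_mean = sum(samples) / len(samples)  # theoretical mean is sqrt(pi / 2)
```

The acceptance probability is approximately 1/(bc), so a loose bound on the range or on the maximum costs efficiency but not correctness.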

Note that the simulation procedures of Fig 5.1 and Fig 5.2 are canonical in the sense that
their forms remain unchanged from the simulation of one SIRV to another. Although the
scheme of Fig 5.2 can be used even when $f_S(s)$ is known, the scheme of Fig 5.1 is preferred when
S can be generated easily. The linear transformation of eq (5.5) is a filtering operation. In both
schemes, pre-modulation filtering is equivalent to post-modulation filtering. This results from
the fact that the representation theorem is valid whether or not the SIRV X and the Gaussian
random vector Z are white.

5.4 Performance Assessment of the Simulation Schemes

In this section we concern ourselves with the performance assessment of the simulation
procedures developed in section 5.3. We point out that the simulation procedures developed in section
5.3  are  exact  in  the  sense  that  they  are  derived  without  approximation  from  theory.  Hence, 
departures  from  the  exact  SIRVs  will  depend  for  the  most  part  on  the  nonideality  of  the  uniform 
random  number  generators  used.  Empirical  assessment  of  the  simulation  procedures  is  necessary 
for  practical  applications. 

One  possible  approach  for  assessing  the  distributional  properties  of  the  simulated  data  is  to 
perform  a  hypothesis  test  on  the  marginal  distributions  of  the  components  of  the  SIRV  where 
the hypotheses are given by

$H_0$: The hypothesis that the simulated data is from the desired distribution.
$H_1$: The hypothesis that the simulated data is not from the desired distribution.

For a fixed Type-1 error probability (i.e., the probability that $H_1$ is accepted given that $H_0$ is
true), each marginal distribution can be checked by employing one of the commonly used goodness
of fit procedures. Since the components of the random vectors are not statistically independent,
we are now confronted with the problem of developing a goodness of fit test for the multivariate
data. In general, it is very difficult to obtain the overall significance level of the test (i.e., the
probability that $H_1$ is accepted given that $H_0$ is true) for the multivariate goodness of fit testing
procedure.

However,  an  attractive  feature  of  SIRVs  is  that  the  quadratic  form  p  appearing  in  the  SIRV 
PDF  contains  all  the  information  necessary  for  identifying  the  PDF  of  the  SIRV.  In  other  words, 
knowledge  of  the  PDF  of  the  quadratic  form  is  sufficient  to  determine  the  underlying  SIRV  PDF. 
Furthermore,  the  quadratic  form  PDF  remains  unchanged  regardless  of  whether  the  SIRV  is  white 
or  colored.  The  PDF  of  the  quadratic  form  appearing  in  the  SIRV  PDF  is  given  by  eq  (5.8).  For 
the radar problem where we deal with N complex samples or 2N quadrature components, note
that we make use of eq (5.8) with N replaced by 2N. Hence, we base our goodness of fit test
procedure  for  the  generated  SIRVs  on  the  PDF  of  the  quadratic  form  p.  Note  that  we  have  now 
reduced  the  multivariate  problem  to  an  equivalent  univariate  problem  involving  the  goodness  of 
fit  test  for  the  PDF  of  the  quadratic  form. 

In the examples presented in this section, we generated m = 1000 realizations of the random
vector Y with N = 2 complex samples and obtained one thousand samples of the quadratic form
p for each of the non-Gaussian SIRVs whose PDFs are listed in Tables 5.1 and 5.3. In each case,
we used the corresponding theoretical PDF of the quadratic form given by eq (5.8) to test for
the distribution of the generated quadratic form. The frequency histograms for the generated
data and the corresponding theoretical PDFs are shown in Figures 5.3-5.10. In addition, a
Chi-Square test was performed on the generated data with the Type-1 error probability fixed at
0.05, and the null hypothesis was not rejected in any case. The histograms provide a good idea
about the true distributions for large sample sizes. Observe that the empirical PDFs are very
close to the theoretical PDFs. Note that the procedure used in this section to assess the
distributional assumptions of the random samples from the SIRV PDFs is a formal goodness of fit
test. Similar procedures have been proposed to test for multivariate normality in [50] and [51].
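
The univariate test on the quadratic form can be sketched as follows. As an assumption made only for illustration, the sketch uses the Gaussian member of the SIRV family with 2N = 4 white quadrature components, for which the quadratic form is Chi-Square distributed with four degrees of freedom; the non-Gaussian cases of this section would substitute the distribution function obtained from eq (5.8). The names `chi2_cdf_4dof` and `chi_square_statistic` are hypothetical, and ten equiprobable bins are an arbitrary choice.

```python
import math
import random


def chi2_cdf_4dof(p):
    """Distribution function of a Chi-Square variable with 4 degrees of freedom."""
    return 1.0 - math.exp(-0.5 * p) * (1.0 + 0.5 * p)


def chi_square_statistic(samples, cdf, n_bins=10):
    """Pearson statistic using bins that are equiprobable under the null CDF."""
    counts = [0] * n_bins
    for s in samples:
        idx = min(int(cdf(s) * n_bins), n_bins - 1)
        counts[idx] += 1
    expected = len(samples) / n_bins
    return sum((obs - expected) ** 2 / expected for obs in counts)


random.seed(2)
m = 1000
# Quadratic form of a white Gaussian vector with 4 real components.
quad_forms = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(4)) for _ in range(m)]
stat = chi_square_statistic(quad_forms, chi2_cdf_4dof)

# Tabulated 95% point of the Chi-Square distribution with 10 - 1 = 9 degrees
# of freedom; the null hypothesis is rejected at the 0.05 level if stat exceeds it.
CRITICAL_9DOF_5PCT = 16.919
reject_null = stat > CRITICAL_9DOF_5PCT
```

Because only the scalar quadratic form is tested, the same routine applies regardless of the dimension or the covariance matrix of the simulated vectors.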

5.5  Conclusions 

In this chapter, we have presented two schemes that can be used in practice to simulate
correlated non-Gaussian radar clutter when the clutter can be modeled as a spherically invariant
random process. We pointed out that the simulation schemes developed are canonical schemes
and do not change form from the simulation of one SIRV to another. A new approach, based on
the PDF of the quadratic form appearing in the SIRV PDF, was used to perform a goodness of fit
test in order to assess performance of the proposed simulation schemes. Performance assessment
based on this scheme showed excellent agreement between the theoretical and empirical PDFs of
the quadratic form. Finally, it was pointed out that use of this technique reduces the goodness
of fit test from a multivariate testing procedure to a univariate testing procedure, resulting in
a considerable simplification of the processing. Therefore, this procedure lends itself very well
to practical applications.

Figure 5.3: Theoretical and Empirical Quadratic form PDFs for Laplace SIRV

Figure 5.5: Theoretical and Empirical Quadratic form PDFs for K-distributed SIRV

Figure 5.10: Theoretical and Empirical Quadratic form PDFs fo

Chapter 6

A  New  Method  for  Univariate 
Distribution  Approximation 


6.1  Introduction 

In this chapter we address the problem of approximating the PDF of a set of random data.
In practice, the clutter PDF encountered in radar signal processing is not known a priori.
Consequently, a scheme that approximates the clutter PDF based on a set of measured data is
necessary. Currently available tests such as the Kolmogorov-Smirnov test and the Chi-Square
test address the problem of goodness-of-fit for random data. In particular, these tests provide
information about whether a set of random data is statistically consistent with a specified
distribution, to within a certain confidence level. However, if the specified distribution is rejected,
these tests cannot be used for approximating the underlying PDF of the random data. Moreover,
these tests require large sample sizes for reliable results.

In practice, only a small number of samples may be available. Therefore, the scheme used
should be efficient for small sample sizes. A new algorithm based on sample order statistics has
been developed in [50] for univariate distribution identification. This algorithm has two modes of
operation. In the first mode the algorithm performs a goodness-of-fit test. Specifically, the test
determines, to a desired confidence level, whether random data is statistically consistent with a
specified probability distribution. In the second mode of operation the algorithm approximates
the PDF underlying the random data. In particular, by analyzing the random data and without
any a priori knowledge, the algorithm identifies from a stored library of PDFs that density
function which best approximates the data. Estimates of the scale, location, and shape
parameters of the PDF are provided by the algorithm. The algorithm typically works well with small
sample sizes of between 50 and 100 samples. An extension of this algorithm for the multivariate
Gaussian PDF has been considered in [50] and [52].

In  this  chapter  we  describe  a  new  method  for  univariate  distribution  approximation.  In  section 
6.2  we  present  definitions.  Section  6.3  describes  the  algorithm  developed  in  [50]  for  univariate 
distribution  identification.  The  proposed  distribution  identification  algorithm  is  discussed  in 
Section  6.4.  Section  6.5  proposes  a  method  to  estimate  the  shape  parameter  based  on  the 
procedure  developed  in  Section  6.4.  Finally,  conclusions  are  presented  in  Section  6.6. 

6.2  Definitions 

Let $f_Y(y)$ denote the PDF of Y which has been standardized in a specified manner. Introduce
the linear transformation defined by

$$X = \beta Y + \alpha \tag{6.1}$$

The PDF of X is given by

$$f_X(x) = \frac{1}{|\beta|}\, f_Y\!\left(\frac{x - \alpha}{\beta}\right) \tag{6.2}$$

where $\alpha$ and $\beta$ are defined to be the location and scale parameters of X, respectively. The mean
$\mu_x$ and variance $\sigma_x^2$ of the random variable X are given by

$$\mu_x = E(X), \qquad \sigma_x^2 = E[(X - \mu_x)^2] \tag{6.3}$$

Although the mean and the variance are related to the location and scale parameters, note that
the location parameter is not the mean value and the scale parameter is not the square root of
the variance, in general. However, for a standardized Gaussian PDF $f_Y(y)$ for which the mean is
zero and the variance is unity, the location parameter is the mean of X and the scale parameter
is the standard deviation (square root of the variance) of X.

The coefficient of skewness, $\alpha_3$, and the coefficient of kurtosis, $\alpha_4$, are defined to be

$$\alpha_3 = \frac{E[(X - \mu_x)^3]}{\sigma_x^3}, \qquad \alpha_4 = \frac{E[(X - \mu_x)^4]}{\sigma_x^4} \tag{6.4}$$

It is readily shown that $\alpha_3$ and $\alpha_4$ are invariant to the values of $\mu_x$ and $\sigma_x$. For any PDF that
is symmetric about the mean, $\alpha_3 = 0$. For the case of the Gaussian distribution, $\alpha_3 = 0$ and
$\alpha_4 = 3$.
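
As a small numerical illustration of these coefficients and of their invariance to location and scale, the following sketch computes sample versions of the skewness and kurtosis, taking them as the standardized third and fourth central moments. The function name `skewness_kurtosis` is hypothetical, and population (divide by n) moments are used for simplicity.

```python
def skewness_kurtosis(data):
    """Return sample estimates (alpha_3, alpha_4) from central moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2


data = [-2.0, -1.0, 0.0, 1.0, 2.0]          # symmetric about its mean
a3, a4 = skewness_kurtosis(data)
# A linear transformation with positive scale leaves both coefficients unchanged.
b3, b4 = skewness_kurtosis([3.0 * x + 7.0 for x in data])
```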

6.3  Goodness  of  Fit  Test 

In  this  section,  we  introduce  a  general  graphical  method  for  testing  whether  a  set  of  random 
data  is  statistically  consistent  with  a  specified  univariate  distribution.  The  proposed  method 
not  only  yields  a  formal  goodness-of-fit  test  but  also  provides  a  graphical  representation  that 
gives  insight  into  how  well  the  random  data  is  representative  of  the  specified  distribution  (null 
hypothesis). Using the normal distribution as a reference distribution, the standardized sample
order  statistics  are  represented  by  a  system  of  linked  vectors.  Both  the  terminal  point  of  these 
linked  vectors  and  the  shape  of  their  trajectories  are  used  in  determining  whether  or  not  to  accept 
the  null  hypothesis. 

In  this  section  we  first  give  a  brief  description  of  the  corresponding  test  statistic  and  then 
explain  the  goodness  of  fit  test  procedure.  For  illustration  purposes,  we  assume  that  the  null 
distribution  is  Gaussian.  However,  the  proposed  procedure  works  for  any  null  hypothesis. 

Let $X_k$, $k = 1, 2, \ldots, n$, denote the kth sample from a Gaussian distribution with mean $\mu$ and
variance $\sigma^2$. We define

$$Y_k = \frac{X_k - \bar{X}}{S}, \qquad k = 1, 2, \ldots, n \tag{6.5}$$

where $\bar{X} = \sum_k X_k / n$ is the sample mean and $S = \{\sum_k (X_k - \bar{X})^2/(n-1)\}^{1/2}$ is the sample standard
deviation. The standardized order statistics are denoted by $Y_{i:n}$, $i = 1, 2, \ldots, n$, and are obtained
by ordering the $Y_k$, $k = 1, 2, \ldots, n$, such that $Y_{1:n} \le Y_{2:n} \le \ldots \le Y_{n:n}$. The ith linked vector
is characterized by its length and orientation with respect to the horizontal axis. Let
$X_{1:n} \le X_{2:n} \le \ldots \le X_{n:n}$ denote the ordered samples obtained by ordering $X_k$, $k = 1, 2, \ldots, n$. Let
$m_{1:n}, m_{2:n}, \ldots, m_{n:n}$ denote the expected values of the standard normal order statistics, where
$m_{i:n} = E[(X_{i:n} - \mu)/\sigma]$. The length of the ith vector, $a_i$, is obtained from the absolute value of the
ith standardized sample order statistic $Y_{i:n}$, while its orientation $\theta_i$ is related to $m_{i:n}$. More
specifically, by definition,

$$a_i = \frac{|Y_{i:n}|}{n}, \qquad \theta_i = \pi\,\Phi(m_{i:n}) \tag{6.6}$$

where $\Phi(x) = (\sqrt{2\pi})^{-1}\int_{-\infty}^{x} \exp(-t^2/2)\,dt$ is the distribution function of the standard Gaussian
distribution. We define the sample points $Q_k$ in a two dimensional plane by

$$Q_k = (U_k, V_k), \qquad k = 0, 1, \ldots, n \tag{6.7}$$

where $U_0 = V_0 = 0$ and

$$U_k = \frac{1}{n}\sum_{i=1}^{k} \cos(\theta_i)\,|Y_{i:n}|, \qquad V_k = \frac{1}{n}\sum_{i=1}^{k} \sin(\theta_i)\,|Y_{i:n}|, \qquad k = 1, 2, \ldots, n. \tag{6.8}$$

The sample linked vectors are obtained by joining the points $Q_k$. Note that $Q_0 = (0, 0)$. It should
also be noted that the statistic $Q_n$ given in eq (6.7) represents the terminal point of the linked
vectors defined above. Figure 6.1 shows the linked vectors obtained for the Gaussian distribution
with n = 6. The null distribution was obtained by averaging the results for 50,000 Monte Carlo
trials. The solid curve in Figure 6.1 shows the linked vectors for the sample distribution while
the dashed curve shows the linked vectors for the null distribution. The magnitudes and angles of
the linked vectors are obtained from eq (6.6). Note that the angles are independent of the data
and depend only on the sample size n. Only the magnitudes of the linked vectors are dependent
on the samples drawn and change from one trial to another.
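
The construction of the linked vectors can be sketched in Python as shown below. The function name `linked_vectors` is hypothetical. One assumption is made that is not part of the text: the expected values of the standard normal order statistics are replaced by Blom's approximation, $m_{i:n} \approx \Phi^{-1}((i - 0.375)/(n + 0.25))$, to keep the sketch short; exact or Monte Carlo values could be substituted.

```python
import math
from statistics import NormalDist


def linked_vectors(samples):
    """Return the points Q_0, ..., Q_n of the sample linked vectors."""
    n = len(samples)
    mean = sum(samples) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in samples) / (n - 1))
    # Standardized sample order statistics Y_{1:n} <= ... <= Y_{n:n}.
    y = sorted((x - mean) / s for x in samples)
    std_normal = NormalDist()
    points = [(0.0, 0.0)]
    u = v = 0.0
    for i, y_i in enumerate(y, start=1):
        # Assumed (Blom) approximation to m_{i:n}.
        m_i = std_normal.inv_cdf((i - 0.375) / (n + 0.25))
        theta = math.pi * std_normal.cdf(m_i)   # angle depends only on i and n
        u += math.cos(theta) * abs(y_i) / n     # length |Y_{i:n}| / n
        v += math.sin(theta) * abs(y_i) / n
        points.append((u, v))
    return points


data = [-3.0, -1.0, -0.5, 0.5, 1.0, 3.0]
path = linked_vectors(data)
q_n = path[-1]                                   # terminal point Q_n = (U_n, V_n)
q_n_shifted = linked_vectors([2.0 * x + 5.0 for x in data])[-1]
```

For this symmetric data set the horizontal coordinate of the terminal point is zero up to rounding, and rescaling or shifting the data leaves the terminal point unchanged.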

For  a  typical  set  of  ordered  samples  (i.e.,  ordered  samples  drawn  from  the  null  distribution) 
it  is  reasonable  to  expect  that  the  sample  linked  vectors  would  closely  follow  the  null  pattern. 
If  the  ordered  set  of  samples  is  not  from  the  null  distribution,  the  sample  linked  vectors  are  not 
expected  to  closely  follow  the  null  pattern.  Hence,  the  procedure  provides  visual  information 
about  how  well  the  ordered  set  of  samples  fit  the  null  distribution. 

An important property of the $Q_n$ statistic is that it is invariant under linear transformation.
In particular, we consider the standardization used in eq (6.5). Let $Z_i = aX_i + b$, where a and
b are known constants. Let $\bar{Z}$ and $S'$ denote the sample mean and the sample standard deviation
of the samples $Z_i$. Then, for a > 0, it is readily shown that $(Z_i - \bar{Z})/S' = (X_i - \bar{X})/S$. The
invariance property follows as a consequence. The advantage of this property is that the PDF of
$Q_n = (U_n, V_n)$ depends only on the sample size n and is unaffected by the location and scale
parameters. Since it is difficult to determine the joint PDF of $U_n$ and $V_n$ analytically, it is
necessary to obtain empirical results.

Assuming that the conditions under the central limit theorem are satisfied, the marginal PDFs
of $U_n$ and $V_n$ can be approximated as Gaussian, in the limit of large n. In addition, it is assumed
that the joint PDF of $U_n$ and $V_n$ is approximately bivariate Gaussian. Consequently, all that is
needed to determine the bivariate PDF is the specification of $E(U_n)$, $E(V_n)$, $E(U_n V_n)$, $Var(U_n)$
and $Var(V_n)$. Drawing samples from the Gaussian distribution, it has been shown empirically
in [50] that for $3 \le n \le 100$


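$$\begin{aligned}
E(U_n) &= 0 \\
E(V_n) &= \mu_v \approx 0.326601 + \frac{0.412921}{n} \\
E(U_n V_n) &= 0 \\
Var(U_n) &= \sigma_u^2 \approx \frac{0.02123}{n} + \frac{0.01765}{n^2} \\
Var(V_n) &= \sigma_v^2 \approx \frac{0.04427}{n} - \frac{0.0951}{n^2}
\end{aligned} \tag{6.9}$$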


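Since $U_n$ and $V_n$ are approximately bivariate Gaussian for large or moderate sample sizes, their
joint PDF can be written as

$$f_{U_n,V_n}(u_n, v_n) = (2\pi)^{-1}(\sigma_u \sigma_v)^{-1}\exp\left(-\frac{t}{2}\right) \tag{6.10}$$

where

$$t = \frac{u_n^2}{\sigma_u^2} + \frac{(v_n - \mu_v)^2}{\sigma_v^2}. \tag{6.11}$$

Let $t = t_0$. Then the equation

$$t_0 = \frac{u_n^2}{\sigma_u^2} + \frac{(v_n - \mu_v)^2}{\sigma_v^2} \tag{6.12}$$

is that of an ellipse in the $u_n, v_n$ plane for which

$$f_{U_n,V_n}(u_n, v_n) = (2\pi)^{-1}(\sigma_u \sigma_v)^{-1}\exp\left(-\frac{t_0}{2}\right). \tag{6.13}$$

Points that fall within the ellipse correspond to those points in the $u_n, v_n$ plane for which

$$f_{U_n,V_n}(u_n, v_n) > (2\pi)^{-1}(\sigma_u \sigma_v)^{-1}\exp\left(-\frac{t_0}{2}\right). \tag{6.14}$$

Let

$$\alpha = P(T > t_0) = P(u_n, v_n \text{ fall outside the ellipse given by eq (6.12)}). \tag{6.15}$$

It is well known that the random variable T defined by eq (6.11) has a Chi-Square
distribution with two degrees of freedom [53], whose PDF is given by

$$f_T(t) = 0.5\exp\left(-\frac{t}{2}\right), \qquad t \ge 0. \tag{6.16}$$

Hence,

$$\alpha = \int_{t_0}^{\infty} f_T(t)\,dt = \exp\left(-\frac{t_0}{2}\right). \tag{6.17}$$

Consequently, $t_0 = -2\ln\alpha$. Thus, eq (6.12) becomes

$$\frac{u_n^2}{\sigma_u^2} + \frac{(v_n - \mu_v)^2}{\sigma_v^2} = -2\ln\alpha. \tag{6.18}$$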

$\alpha$ is known as the significance level of the test. It is the probability that $Q_n$ falls outside the
ellipse specified by eq (6.18) given that the data is coming from a Gaussian distribution. $1 - \alpha$
is known as the confidence level and the corresponding ellipse is known as the confidence ellipse.
Eq (6.12) can be written in the standardized form

$$\frac{u_n^2}{\sigma_u^2 t_0} + \frac{(v_n - \mu_v)^2}{\sigma_v^2 t_0} = 1 \tag{6.19}$$

where the lengths of the semi-major and semi-minor axes are given by $\max[\sigma_u\sqrt{t_0}, \sigma_v\sqrt{t_0}]$ and
$\min[\sigma_u\sqrt{t_0}, \sigma_v\sqrt{t_0}]$, respectively. From eq (6.17), observe that smaller values of $\alpha$ correspond to
larger values of $t_0$. Consequently, the confidence ellipses become larger as the confidence level is
increased.

For a given sample size n ($n \le 100$), approximate values of $\mu_v$, $\sigma_u^2$ and $\sigma_v^2$ can be obtained
from eq (6.9). The confidence ellipse of eq (6.18) can then be used to make a visual test of the
null hypothesis. If the terminal sample point falls inside the ellipse, then the data is declared as
being consistent with the Gaussian distribution with confidence level $1 - \alpha$. Otherwise the null
hypothesis is rejected with a significance level $\alpha$.
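
The ellipse test can be sketched as a short Python function. The name `inside_confidence_ellipse` and its argument names are hypothetical, and the moments $\mu_v$, $\sigma_u^2$ and $\sigma_v^2$ are passed in by the caller rather than hard-coded. The threshold follows from taking $\alpha = P(T > t_0)$ as in eq (6.15) for a Chi-Square variable with two degrees of freedom, which gives $t_0 = -2\ln\alpha$; for $\alpha = 0.05$ this is about 5.99, the tabulated 95% point of that distribution.

```python
import math


def inside_confidence_ellipse(u_n, v_n, mu_v, var_u, var_v, alpha):
    """Return True if (u_n, v_n) lies inside the (1 - alpha) confidence ellipse."""
    t0 = -2.0 * math.log(alpha)                       # ellipse size for level alpha
    t = u_n ** 2 / var_u + (v_n - mu_v) ** 2 / var_v  # quadratic form of eq (6.11)
    return t <= t0


# Illustrative (hypothetical) moment values, not taken from the text.
mu_v, var_u, var_v = 0.33, 0.001, 0.001
accept_center = inside_confidence_ellipse(0.0, 0.33, mu_v, var_u, var_v, alpha=0.05)
accept_far = inside_confidence_ellipse(1.0, 0.33, mu_v, var_u, var_v, alpha=0.05)
```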

A major difficulty in determining the joint PDF of $U_n$ and $V_n$ is that the coefficients of skewness
and kurtosis of $U_n$ and $V_n$ (see Table 6.1) indicate that the Gaussian approximation for the
bivariate PDF may not be satisfactory for n < 10. The empirical bivariate PDFs of $U_n$ and
$V_n$ were obtained by using 50,000 Monte-Carlo trials for n = 3, 10, 20, 30, 50 and 100. The
corresponding probability contours are shown in Figure 6.2. The same procedure is used even
when the null distribution is different from the Gaussian distribution. However, note that the
standard Gaussian distribution is always used as the reference distribution for determining the
angles $\theta_i$.

6.4  Distribution  Approximation 

In this section we present a graphical procedure for approximating the underlying PDF of a
set of random data based on the goodness-of-fit test procedure discussed in section 6.3.

Following a similar approach to that outlined in section 6.3, random samples are generated
from many different univariate probability distributions. For each specified distribution and for a
given n, the statistic $Q_n = (U_n, V_n)$ given by eq (6.8) is obtained for various choices of the shape
parameter. Thus, each distribution is represented by a trajectory in the two dimensional plane
whose coordinates are $U_n$ and $V_n$. Figure 6.3 shows an example of such a representation. Twelve
distributions, namely Gaussian (1), Uniform (2), Exponential (3), Laplace (4), Logistic (5),
Cauchy (6), Extreme Value (7), Gumbel type-2 (8), Gamma (9), Pareto (10), Weibull (11) and
Lognormal (12), are represented in this chart. The value of $Q_n$ at each point of the trajectories is
obtained by Monte-Carlo experiments using the standard Gaussian distribution as the reference
distribution for determining the angles $\theta_i$. The results are based on averaging 1000 trials of 50
samples from each distribution. The samples from each distribution are obtained by using the
IMSL subroutines for specified values of the shape parameter. Since the procedure is location
and scale invariant, the trajectory reduces to a single point for those PDFs which do not have
shape parameters but are characterized only in terms of their location and scale parameters. By
way of example, the Gaussian, Laplace, Exponential, Uniform and Cauchy PDFs are represented
by single points in the $U_n$-$V_n$ plane. However, those PDFs which have shape parameters are
represented by trajectories. For a given value of the shape parameter, a single point is obtained
in the $U_n$-$V_n$ plane. By varying the shape parameter, isolated points are determined along
the trajectory. The trajectory for the PDF is obtained by joining these points. In a sense the
trajectory represents a family of PDFs having the same distribution but with different shape
parameter values. For example, the trajectory corresponding to the Gamma distribution in
Figure 6.3 is obtained by joining the points for which the shape parameters are 0.2, 0.3, 0.5, 0.7,
1.0, 2.0, 3.0, 4.0, 6.0, 10.0. As the shape parameter increases, note that the Gamma distribution
approaches the Gaussian distribution. The representation of Figure 6.3 is called an identification
chart. Some distributions, such as the Beta distribution and the SU-Johnson system of distributions,
have two shape parameters. For these cases, the trajectories are obtained by holding one shape
parameter fixed while the other is varied. For these distributions, several different trajectories
are generated in order to cover as much of the $U_n$-$V_n$ plane as possible. For certain choices of
the shape parameters, two or more PDFs become identical. When this occurs, their trajectories
intersect on the identification chart.

It is apparent that the identification chart of Figure 6.3 provides a one to one graphical
representation for each PDF for a given n. Therefore, every point in the identification chart
corresponds  to  a  specific  distribution.  Thus,  if  the  null  hypothesis  in  the  goodness-of-fit  test 
discussed  in  section  6.3  is  rejected,  then  the  distribution  which  approximates  the  underlying 
PDF  of  the  set  of  random  data  can  be  obtained  by  comparing  Qn  obtained  for  the  samples  with 
the  existing  trajectories  in  the  chart.  The  closest  point  or  trajectory  to  the  sample  Qn  is  chosen 
as  an  approximation  to  the  PDF  underlying  the  random  data.  The  closest  point  or  trajectory 
to  the  sample  point  is  determined  by  projecting  the  sample  point  Qn  to  neighboring  points  or 
trajectories  on  the  chart  and  choosing  that  point  or  trajectory  whose  perpendicular  distance 
from  the  sample  point  is  the  smallest.  The  complete  approximation  algorithm  is  summarized  as 
follows. 

1.  Compute  Yk  as  specified  in  section  6.3 

2. Obtain the standardized order statistics $Y_{i:n}$ by ordering the $Y_k$.

3.  Compute  Un  and  Vn  from  eq  (6.8). 

4.  Obtain  an  identification  chart  based  on  the  sample  size  n  as  discussed 
in  this  section.  Plot  the  sample  point  Qn  on  this  chart. 

5.  Compare  the  sample  point  Qn  with  the  existing  distributions  on  the 
chart.  The  nearest  neighboring  point  (or  trajectory)  on  the  chart  is 
used  as  an  approximation  to  the  PDF  of  the  samples. 
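
Step 5 of the algorithm amounts to a nearest-neighbor search over the points and trajectories of the chart. The sketch below assumes the chart is stored as a dictionary mapping a distribution name to a list of $(U_n, V_n)$ points (a single point for a PDF with no shape parameter, a polyline otherwise). The function names and the chart coordinates are hypothetical and serve only to illustrate the search.

```python
import math


def distance_to_segment(q, a, b):
    """Shortest distance from point q to the line segment joining a and b."""
    (qx, qy), (ax, ay), (bx, by) = q, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        return math.hypot(qx - ax, qy - ay)
    # Parameter of the perpendicular projection, clipped to the segment.
    t = max(0.0, min(1.0, ((qx - ax) * dx + (qy - ay) * dy) / seg_len2))
    return math.hypot(qx - (ax + t * dx), qy - (ay + t * dy))


def nearest_distribution(q, chart):
    """Return (name, distance) of the chart entry closest to the sample point q."""
    best_name, best_dist = None, math.inf
    for name, pts in chart.items():
        if len(pts) == 1:
            d = math.hypot(q[0] - pts[0][0], q[1] - pts[0][1])
        else:
            d = min(distance_to_segment(q, a, b) for a, b in zip(pts, pts[1:]))
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name, best_dist


# Hypothetical chart coordinates, for illustration only.
chart = {
    "point_pdf": [(0.0, 0.30)],
    "trajectory_pdf": [(-0.20, 0.20), (-0.10, 0.25), (-0.05, 0.28)],
}
name_a, dist_a = nearest_distribution((0.01, 0.31), chart)
name_b, dist_b = nearest_distribution((-0.15, 0.22), chart)
```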

The  accuracy  of  this  procedure  can  be  increased  by  including  as  many  distributions  as  possible 
in the identification chart. However, it is emphasized that this procedure does not identify the
underlying PDF. Rather, it identifies a suitable approximation to the underlying PDF.

6.5  Parameter  Estimation 

Once the distribution of the samples is approximated, the next step is to estimate its
parameters. The method discussed in section 6.4 lends itself to estimating the parameters of the
approximated distribution. We present the estimation procedure for the location, scale and shape
parameters in this section.

6.5.1  Estimation  of  Location  and  Scale  Parameters 


Let $f(x; \alpha, \beta)$ denote the distribution which approximates the PDF of the set of random data,
where $\alpha$ and $\beta$ are the location parameter and scale parameter, respectively, of the approximating
PDF. Let $X_{i:n}$ denote the order statistics of X from a sample of size n. The standardized
order statistics are defined by

$$W_{i:n} = \frac{X_{i:n} - \alpha}{\beta}. \tag{6.20}$$

Let

$$\mu_{i:n} = E[W_{i:n}]. \tag{6.21}$$

Then

$$E[X_{i:n}] = \beta\mu_{i:n} + \alpha. \tag{6.22}$$

We consider the following statistics

$$T_1 = \sum_{i=1}^{n} \cos(\theta_i)\,X_{i:n}, \qquad T_2 = \sum_{i=1}^{n} \sin(\theta_i)\,X_{i:n} \tag{6.23}$$

where $\theta_i$ is the angle defined in eq (6.6). The expected values of $T_1$ and $T_2$ are

$$E[T_1] = \sum_{i=1}^{n} \cos(\theta_i)\,[\beta\mu_{i:n} + \alpha], \qquad E[T_2] = \sum_{i=1}^{n} \sin(\theta_i)\,[\beta\mu_{i:n} + \alpha]. \tag{6.24}$$

These can be written as

$$E(T_1) = a\alpha + b\beta, \qquad E(T_2) = c\alpha + d\beta \tag{6.25}$$

where

$$a = \sum_{i=1}^{n} \cos(\theta_i), \quad b = \sum_{i=1}^{n} \mu_{i:n}\cos(\theta_i), \quad c = \sum_{i=1}^{n} \sin(\theta_i), \quad d = \sum_{i=1}^{n} \mu_{i:n}\sin(\theta_i). \tag{6.26}$$

Because the standardized Gaussian distribution is used as the reference distribution for $\theta_i$, it can
be shown that a = 0. It follows that

$$\hat{\beta} = \frac{\hat{E}[T_1]}{b}, \qquad \hat{\alpha} = \frac{\hat{E}[T_2] - d\,\hat{\beta}}{c} \tag{6.27}$$

where the symbol ^ is used to denote an estimate. For n sufficiently large (i.e., n > 50), suitable
estimates for $E[T_1]$ and $E[T_2]$ are

$$\hat{E}[T_1] = T_1, \qquad \hat{E}[T_2] = T_2. \tag{6.28}$$

Estimates for b and d rely upon an estimate of $\mu_{i:n}$. The estimate $\hat{\mu}_{i:n}$ is obtained from a Monte Carlo
simulation of $W_{i:n}$, where $W_{i:n}$ is generated from the known approximating distribution $f(x; 0, 1)$
having zero location and unity scale parameters. $\hat{\mu}_{i:n}$ is the sample mean of $W_{i:n}$ based upon 1000
Monte Carlo trials. Having $\hat{\mu}_{i:n}$, the estimates for b and d are given by

$$\hat{b} = \sum_{i=1}^{n} \hat{\mu}_{i:n}\cos(\theta_i), \qquad \hat{d} = \sum_{i=1}^{n} \hat{\mu}_{i:n}\sin(\theta_i). \tag{6.29}$$

The scale and location parameters are then estimated by application of eq (6.27).
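
A compact Python sketch of this estimation procedure is given below. The function names are hypothetical, and `draw_standard` is an assumed hook that draws one variate from the approximating distribution with zero location and unit scale. The angles are computed with Blom's approximation to the expected normal order statistics, which is an assumption of this sketch and reduces $\theta_i = \pi\Phi(m_{i:n})$ to $\pi(i - 0.375)/(n + 0.25)$.

```python
import math
import random


def reference_angles(n):
    """Angles theta_i under the assumed Blom approximation to m_{i:n}."""
    return [math.pi * (i - 0.375) / (n + 0.25) for i in range(1, n + 1)]


def estimate_location_scale(samples, draw_standard, trials=1000, rng=random):
    """Return (alpha_hat, beta_hat) from the statistics T1 and T2."""
    n = len(samples)
    theta = reference_angles(n)
    x = sorted(samples)
    # T1 and T2 computed from the ordered samples.
    t1 = sum(math.cos(t) * x_i for t, x_i in zip(theta, x))
    t2 = sum(math.sin(t) * x_i for t, x_i in zip(theta, x))
    # Monte Carlo estimate of mu_{i:n} = E[W_{i:n}] for the standard distribution.
    mu = [0.0] * n
    for _ in range(trials):
        w = sorted(draw_standard(rng) for _ in range(n))
        for i, w_i in enumerate(w):
            mu[i] += w_i / trials
    b = sum(m * math.cos(t) for m, t in zip(mu, theta))
    c = sum(math.sin(t) for t in theta)
    d = sum(m * math.sin(t) for m, t in zip(mu, theta))
    beta_hat = t1 / b                     # scale estimate (uses a = 0)
    alpha_hat = (t2 - d * beta_hat) / c   # location estimate
    return alpha_hat, beta_hat


random.seed(3)
data = [5.0 + 2.0 * random.gauss(0.0, 1.0) for _ in range(200)]
alpha_hat, beta_hat = estimate_location_scale(data, lambda r: r.gauss(0.0, 1.0))
```

In this example the data are Gaussian with location 5 and scale 2, so the two estimates should land near those values, up to sampling error.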

6.5.2  Shape  Parameter  Estimation 

In this section we present an approximate method for estimating the shape parameter of the
approximating PDF. This procedure can be used only when one of the shape parameters is
unknown. Let $\gamma$ denote the shape parameter of the approximating PDF being estimated. Since
$U_n$ and $V_n$ are location and scale invariant, the point $Q_n$ depends only on the sample size n and
the shape parameter $\gamma$. The expected values of $U_n$ and $V_n$ can be expressed as

$$E(U_n) = \varphi_1(n, \gamma), \qquad E(V_n) = \varphi_2(n, \gamma) \tag{6.30}$$

where $\varphi_1(\cdot, \cdot)$ and $\varphi_2(\cdot, \cdot)$ are some functions of $\gamma$ and n. For a given sample size n and shape parameter
$\gamma_0$, the corresponding expected point $(\varphi_1(n, \gamma_0), \varphi_2(n, \gamma_0))$ can be determined approximately in the
$U_n$-$V_n$ plane.

The proposed shape parameter estimation method is based on finding a point such that

$$U_n = \varphi_1(n, \hat{\gamma}), \qquad V_n = \varphi_2(n, \hat{\gamma}) \tag{6.31}$$

where $\hat{\gamma}$ is the sample estimator of $\gamma$. However, in many instances the sample point may not
correspond exactly to a particular trajectory. In such a case, let $E(Q_{1n}) = (u_1, v_1)$ and
$E(Q_{2n}) = (u_2, v_2)$ denote the expected points corresponding to two different shape parameter values
$\gamma = \gamma_1$ and $\gamma = \gamma_2$. It is assumed that the sample point lies in between the points corresponding to
$\gamma_1$ and $\gamma_2$. Assuming that linear interpolation provides a satisfactory approximation, the estimate
of the shape parameter corresponding to the sample point is given by

$$\hat{\gamma} \approx \gamma_1 + \frac{(\gamma_2 - \gamma_1)(s_0 - u_1)}{(u_2 - u_1)} \tag{6.32}$$

where

$$s_0 = \frac{M(V_n - v_1) + M^2 u_1 + U_n}{M^2 + 1}, \qquad M = \frac{(v_2 - v_1)}{(u_2 - u_1)}. \tag{6.33}$$

The accuracy of the procedure can be improved by employing a non-linear interpolation method.
It must be emphasized that the shape parameter estimation procedure presented in this section
is an approximate procedure.
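
The interpolation can be sketched as follows: the sample point is projected perpendicularly onto the line joining the two expected points, and the shape parameter is interpolated linearly along the horizontal coordinate of the projection. The function name `interpolate_shape` and the numerical values are hypothetical; the sketch assumes $u_1 \ne u_2$.

```python
def interpolate_shape(u_n, v_n, p1, p2, gamma1, gamma2):
    """Linearly interpolated shape parameter for the sample point (u_n, v_n).

    p1 = (u1, v1) and p2 = (u2, v2) are the expected points for gamma1, gamma2.
    """
    (u1, v1), (u2, v2) = p1, p2
    m = (v2 - v1) / (u2 - u1)            # slope M of the joining line
    # Horizontal coordinate of the foot of the perpendicular from the sample point.
    s0 = (m * (v_n - v1) + m * m * u1 + u_n) / (m * m + 1.0)
    return gamma1 + (gamma2 - gamma1) * (s0 - u1) / (u2 - u1)


# Hypothetical expected points for gamma = 1 and gamma = 3.
p1, p2 = (0.0, 0.0), (2.0, 2.0)
g_mid = interpolate_shape(1.0, 1.0, p1, p2, 1.0, 3.0)   # point on the line
g_off = interpolate_shape(0.0, 2.0, p1, p2, 1.0, 3.0)   # point off the line
```

Both sample points project onto the midpoint of the segment, so both calls return the midpoint value of the shape parameter.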


6.6  Conclusions 

This chapter has presented a new algorithm for analyzing univariate random data. The
algorithm provides a graphical representation for a goodness-of-fit test which determines whether a set
of random data is statistically consistent with a specified PDF. Also, a graphical procedure is
presented for the problem of approximating the underlying PDF of a set of random data.
Estimation of the location, scale and shape parameters of the approximating PDF has been discussed.
Finally, it must be pointed out that the chief advantage of the algorithm presented in this chapter
is that it works well for small sample sizes of between 50 and 100 samples.

Figure 6.1: Linked Vector Chart: Dashed Lines $P_0$ = Null Linked Vectors, Solid Lines $P_1$ =
Sample Linked Vectors

Figure 6.3: Identification Chart for Univariate Distributions Based on 1000 samples (n=50)

Chapter 7

Distribution  Approximation  of  Radar 
Clutter  by  SIRPs 

7.1  Introduction 

This  investigation  is  motivated  by  a  desire  to  characterize  correlated  non-Gaussian  radar  clutter 
by  approximating  the  underlying  probability  density  function  of  the  clutter.  Various  investigators 
have reported experimental results where non-Gaussian marginal probability density functions
(PDFs) have been used to model the clutter. Usually, radars process N samples at a time.
Statistical  characterization  of  the  clutter  requires  the  specification  of  the  joint  PDF  of  the  N 
samples.  In  addition,  the  clutter  may  be  highly  correlated.  Hence,  the  joint  PDF  must  take  into 
account  the  correlation  between  samples.  Statistical  characterization  of  the  clutter  is  necessary 
if  an  optimal  radar  signal  processor  is  to  be  obtained.  For  use  of  the  well  known  likelihood  ratio 
test,  it  is  desirable  to  have  closed  form  expressions  for  the  joint  PDF  of  the  N  clutter  samples 
in  order  to  obtain  the  optimal  radar  signal  processor.  The  joint  PDF  of  the  N  clutter  samples 
can  be  easily  specified  when  the  clutter  is  Gaussian.  However,  when  the  clutter  is  non-Gaussian 
and  is  correlated,  many  different  joint  PDFs  of  the  clutter  samples  can  result  in  the  same  set  of 
marginal  (univariate)  distributions  having  a  specified  non-Gaussian  behavior.  The  multivariate 
non-Gaussian  PDF  can  be  specified  uniquely  only  when  the  random  variables  are  statistically 
independent. 

Specification of the multivariate PDF is generally a non-trivial problem with no simple best
solution [54]. As explained earlier, the theory of Spherically Invariant Random Processes (SIRPs)
provides a powerful mechanism to obtain the joint PDF of the N correlated, non-Gaussian clutter
samples. Many of the tractable properties of the Gaussian random process also apply to SIRPs.
SIRPs have received considerable attention over the past two decades since most of the elegant
and mathematically tractable properties of the multivariate Gaussian distribution generalize to
this class of distributions. Applications of SIRPs can be found in the random flight problem [27],
signal detection [29], speech signal modeling [30] and radar clutter modeling [32, 34].

In this Chapter, using certain properties of SIRPs, we adopt an algorithm developed in [50]
to identify the underlying distribution of a given set of data. Section 7.2 provides background
information about SIRPs. In Section 7.3 we present a procedure for the goodness of fit test
for PDFs arising from SIRPs. The proposed distribution identification algorithm is discussed
in Section 7.4. Section 7.5 proposes a method to estimate the shape parameter based on the
procedure developed in Section 7.4. Finally, conclusions are presented in Section 7.6.

7.2  Characterization  of  Elliptically  Symmetric  Distributions 

A random vector $\mathbf{X} = [X_1, X_2, \ldots, X_N]^T$ is said to have an elliptically contoured distribution
if the characteristic function of $\mathbf{X}$ can be expressed as

$$\Phi_{\mathbf{X}}(\boldsymbol{\omega}) = \exp(j\boldsymbol{\omega}^T\boldsymbol{\mu})\,\phi(\boldsymbol{\omega}^T\boldsymbol{\Sigma}\boldsymbol{\omega}) \tag{7.1}$$

where $\boldsymbol{\omega}$ and $\boldsymbol{\mu}$ are N by 1 vectors, $\boldsymbol{\Sigma}$ is an N by N positive definite matrix and $\phi$ is an
arbitrary function [37]. In many practical applications involving Monte Carlo experiments, a
more restricted class of elliptically contoured distributions is used because of its relative
simplicity. This class of distributions is called elliptically symmetric distributions (ESD) and has
a PDF of the form

$$f_{\mathbf{X}}(\mathbf{x}) = k\,|\boldsymbol{\Sigma}|^{-1/2}\,h_N(p) \tag{7.2}$$

where $k$ is a normalization constant chosen so that the volume under the curve of $f_{\mathbf{X}}(\mathbf{x})$ is
unity, $p = (\mathbf{x} - \boldsymbol{\mu})^T\boldsymbol{\Sigma}^{-1}(\mathbf{x} - \boldsymbol{\mu})$ is a non-negative quadratic form and $h_N(p)$ is a non-negative,
monotonically decreasing, real valued function. The random vector $\mathbf{X}$ having a PDF of the form
of eq (7.2) is also called a spherically invariant random vector (SIRV). The constant $k$ is equal
to $(2\pi)^{-N/2}$. In this Chapter we shall restrict our attention to SIRVs. A representation theorem
for SIRVs [28] states that if a random vector is an SIRV then there exists a non-negative random
variable $S$ such that the PDF of the random vector conditioned on $S$ is a multivariate Gaussian
PDF. In mathematical terms, we consider the product given by $\mathbf{X} = \mathbf{Z}S$ where $\mathbf{X}$ is an SIRV, $S$
is a non-negative random variable having PDF $f_S(s)$ and $\mathbf{Z}$ is a Gaussian random vector having
the same dimensions as $\mathbf{X}$. Then, we can express $h_N(p)$ as

$$h_N(p) = \int_0^\infty s^{-N}\exp\left(-\frac{p}{2s^2}\right) f_S(s)\,ds \tag{7.3}$$

where $p$ is the previously defined quadratic form. The PDF of the random variable $S$ (i.e. $f_S(s)$)
is called the characteristic PDF of the SIRV. We define a spherically invariant random process
as a random process (real or complex) such that every random vector obtained by sampling this
process is an SIRV having the same characteristic PDF.
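
The product representation X = ZS suggests a direct way to simulate an SIRV: draw a Gaussian vector and scale it by an independent non-negative variable. The sketch below is only an illustration. It assumes zero mean and identity covariance, and it uses the standard construction of the multivariate Student-t (S equal to the square root of dof divided by a chi-square variate with dof degrees of freedom), which may be parameterized differently from the Student-t SIRV in this report. All function names are hypothetical.

```python
import math
import random

def sample_sirv(n_dim, draw_s, rng):
    """Draw one SIRV sample X = S * Z with Z ~ N(0, I)."""
    s = draw_s(rng)  # non-negative modulating variable S
    return [s * rng.gauss(0.0, 1.0) for _ in range(n_dim)]

def student_t_modulator(dof):
    """Return a sampler for S = sqrt(dof / W), with W ~ chi-square(dof)."""
    def draw(rng):
        # chi-square(dof) is a Gamma variate with shape dof/2 and scale 2
        w = rng.gammavariate(dof / 2.0, 2.0)
        return math.sqrt(dof / w)
    return draw

rng = random.Random(0)
samples = [sample_sirv(4, student_t_modulator(5.0), rng) for _ in range(2000)]
```

Passing a sampler that always returns 1.0 in place of `student_t_modulator` recovers the Gaussian case, which is one way to sanity-check such a generator.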

In the special case when $\boldsymbol{\Sigma}$ is the identity matrix, eq (7.2) represents the PDF of a spherically
symmetric random vector. This is due to the fact that the PDF in such a case is a function of
$\mathbf{x}^T\mathbf{x}$. Elliptically symmetric distributions are related to spherically symmetric distributions in
an interesting way. If $\mathbf{Y}$ is a spherically symmetric random vector, then a random vector $\mathbf{X}$
which has an ESD can be obtained by the linear transformation [28]

$$\mathbf{X} = \mathbf{A}\mathbf{Y} + \mathbf{b} \tag{7.4}$$

where $\mathbf{A}$ is an N by N matrix such that

$$\boldsymbol{\Sigma} = \mathbf{A}\mathbf{A}^T \tag{7.5}$$

and $\mathbf{b}$ is a known N x 1 vector. Thus, in many applications it is sufficient to deal with spherically
symmetric distributions and generalize the results to elliptically symmetric distributions.

Finally, the PDF of the quadratic form appearing in eq (7.2) is given by

$$f_P(p) = \frac{p^{N/2-1}}{2^{N/2}\,\Gamma(N/2)}\,h_N(p)\,u(p) \tag{7.6}$$

where $\Gamma(\cdot)$ is the Euler Gamma function and $u(p)$ is the unit step function [34]. It has also
been pointed out in Chapter 3 that the PDF of the quadratic form remains unchanged regardless
of whether the PDF of the random vector is spherically symmetric or elliptically symmetric. For
example, in the multivariate Gaussian case, the PDF of the quadratic form is the well known
Chi-square distribution with N degrees of freedom. Therefore, for a given N, the SIRV (or
spherically symmetric distribution) is uniquely characterized by the quadratic form. In order to
identify the PDF of the underlying SIRV it is sufficient to identify the PDF of the quadratic
form. This attractive property of SIRVs enables us to study various distributional aspects of
the corresponding multivariate samples. When a radar uses coherent processing, the joint PDF
of the 2N quadrature components is of interest. The above results are then applicable with N
replaced by 2N.
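
As a numerical check on the form of eq (7.6), the sketch below evaluates the quadratic-form PDF for a user-supplied $h_N(p)$ and applies it to the Gaussian case, where $h_N(p) = \exp(-p/2)$ and the result should be the Chi-square PDF with N degrees of freedom. The function names are hypothetical, and the formula coded here is the standard Chi-square-consistent form assumed for eq (7.6).

```python
import math

def quadratic_form_pdf(p, n_dim, h):
    """Evaluate p^(N/2-1) h_N(p) / (2^(N/2) Gamma(N/2)) for p > 0, else 0."""
    if p <= 0.0:
        return 0.0
    half = n_dim / 2.0
    return p ** (half - 1.0) * h(p) / (2.0 ** half * math.gamma(half))

def h_gaussian(p):
    """h_N(p) for the multivariate Gaussian SIRV."""
    return math.exp(-p / 2.0)

# Riemann-sum check that the N = 4 density integrates to about one
step = 0.001
area = sum(quadratic_form_pdf(k * step, 4, h_gaussian) for k in range(1, 60001)) * step
```

For N = 2 the expression reduces to $0.5\exp(-p/2)$, the exponential density that is the Chi-square PDF with two degrees of freedom.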

7.3  Assessing  the  Distributional  Properties 

In  modeling  real  world  data,  the  first  step  is  to  determine  the  most  appropriate  PDF  that 
approximates  the  data.  In  the  univariate  case,  the  fit  and  assessment  of  the  goodness  of  fit 
for  various  distributions  has  been  studied  extensively  and  several  methods  are  available  for  this 
purpose.  However,  limited  success  has  been  achieved  for  the  multivariate  situation.  Although  a 
number  of  multivariate  distributions  have  been  developed,  the  multivariate  Gaussian  distribution 
has been the focus of most of the techniques for multivariate analysis [55].

Assessment of the distributional assumptions for multivariate data is a non-trivial problem.
Several techniques have been proposed to assess multivariate Gaussianity. In a recent paper,
Ozturk and Romeu [52] review the methods for testing multivariate Gaussianity.
Many  of  these  methods  can  be  modified  or  generalized  to  develop  goodness  of  fit  methods  for 
elliptically  symmetric  distributions.  If  a  random  vector  Y  is  an  SIRV,  then  the  corresponding 
marginal  distributions  must  be  identical  except  for  their  location  and  scale  parameters.  Based 
on this property, one can use the standard univariate goodness of fit testing procedures to
assess  the  degree  of  similarity  of  the  marginal  distributions  of  the  multivariate  data.  However, 
such  an  approach  does  not  provide  a  way  to  assess  the  joint  distribution  of  the  components  of 
the  multivariate  sample.  Recall  from  Section  4.5  that  SIRVs  can  be  characterized  in  terms  of  the 
quadratic  form  P.  Equation  (7.6)  provides  an  important  property  for  developing  goodness  of  fit 
test  procedures  for  SIRVs.  Specifically,  if  the  PDF  of  P  can  be  identified,  then  the  corresponding 
PDF  of  the  SIRV  can  also  be  identified.  In  fact,  many  tests  for  assessment  of  multivariate 
Gaussianity  are  based  on  the  use  of  this  quadratic  form  [56].  By  use  of  this  technique,  note 
that  the  multivariate  distribution  identification  problem  is  reduced  to  a  corresponding  univariate 
distribution  identification  of  the  quadratic  form.  Any  of  the  classical  goodness  of  fit  testing 
procedures  like  the  Kolmogorov-Smirnov  and  Chi-Square  tests  can  be  used  to  address  the  problem 
of distribution identification of the quadratic form. However, the requirement of large sample
sizes for specifying the parameters of the distribution and the low power of these tests necessitate the use
of alternate procedures that are more efficient.

A general algorithm was developed in [50] to test for univariate and multivariate normality.
In this section we propose the use of this algorithm for performing the goodness of fit test for
SIRVs. The procedure is summarized here for completeness. Let $\mathbf{X} = [X_1, X_2, \ldots, X_N]^T$ denote
a vector of observations. For each of the n observation vectors in the sample, we compute the corresponding
quadratic form $P_i$ ($i = 1, 2, \ldots, n$). Our goal is to test whether the transformed sample belongs to
a certain distribution $F(p; \alpha, \beta, \gamma)$ where $\alpha$ and $\beta$ are the location and scale parameters, respectively,
and $\gamma$ is the shape parameter.

The standardized order statistics are denoted by $Y_{i:n}$, $i = 1, 2, \ldots, n$, and are obtained by ordering
the $Y_k$, $k = 1, 2, \ldots, n$, such that $Y_{1:n} \le Y_{2:n} \le \ldots \le Y_{n:n}$.

$$Y_{i:n} = \frac{P_{i:n} - \bar{P}}{S_P} \tag{7.7}$$


where $\bar{P}$ and $S_P$ are the sample mean and sample standard deviation, respectively, of $P_k$, $k = 1, 2, \ldots, n$.
The ith standardized ordered quadratic form sample is represented by a point $Q_i = (U_i, V_i)$
in a two dimensional plane where

$$U_i = \frac{1}{n}\sum_{j=1}^{i}\cos\{\pi\Phi(m_{j:n})\}\,|Y_{j:n}|, \qquad V_i = \frac{1}{n}\sum_{j=1}^{i}\sin\{\pi\Phi(m_{j:n})\}\,|Y_{j:n}| \tag{7.8}$$

In the above equations $\pi = 3.14159$, $\Phi(\cdot)$ is the distribution function of the standard normal PDF
and $m_{j:n}$ is the expected value of the jth order statistic from the standard normal PDF.

For a given multivariate sample, the points $Q_i$ ($i = 1, 2, \ldots, n$) are plotted and joined to obtain
a linked vector chart. Similarly, using the expected values of the statistic $Y_{j:n}$ ($j = 1, 2, \ldots, n$)
under the null hypothesis an expected linked vector chart can also be obtained. The proposed
test is based on comparing the sample and expected linked vectors. If the null hypothesis is true,
then we expect that the sample linked vectors will follow the expected linked vectors closely.
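
The construction of the sample linked vectors can be sketched as follows. This is a minimal illustration, not the implementation of [50]: it assumes a 1/n normalization of the partial sums, uses the sample standard deviation with an n - 1 divisor, and replaces the exact expected normal order statistics $m_{j:n}$ with Blom's approximation $\Phi^{-1}((j - 0.375)/(n + 0.25))$. The function name is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def linked_vectors(p_values):
    """Return the points Q_1..Q_n of the linked vector chart for a sample."""
    n = len(p_values)
    std_normal = NormalDist()
    p_bar, s_p = mean(p_values), stdev(p_values)
    # Standardize, then order: Y_{1:n} <= ... <= Y_{n:n}
    y = sorted((p - p_bar) / s_p for p in p_values)
    points, u, v = [], 0.0, 0.0
    for j, y_j in enumerate(y, start=1):
        # Blom's approximation to m_{j:n} (an assumption, not the report's values)
        m_j = std_normal.inv_cdf((j - 0.375) / (n + 0.25))
        theta = math.pi * std_normal.cdf(m_j)  # angle of the jth linked vector
        u += math.cos(theta) * abs(y_j) / n
        v += math.sin(theta) * abs(y_j) / n
        points.append((u, v))
    return points
```

Because the sample is standardized before the angles are applied, shifting or rescaling the input leaves every point unchanged, and a sample that is symmetric about its mean ends with a terminal abscissa of approximately zero.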

Finally, a formal goodness of fit test is performed using the terminal point of the expected
linked vectors (i.e., $Q_n = (U_n, V_n)$). A confidence contour for the true point is obtained to provide a
test of the hypothesis. If the terminal point of the sample does not fall inside the $100(1-\alpha)\%$ confidence
ellipse, then the corresponding null hypothesis is rejected at the $\alpha$ level of significance. Note that
the $Q_n$ test provides an interesting graphical representation of the data. An example of such a
graphical representation is given in Fig 7.1 for testing a multivariate Gaussian distribution with
n = 50 and N = 4.

It should be noted that the $Q_n$ statistic is location and scale invariant. In other words, it is
independent of the location and scale parameters. However, it depends on the shape parameter
of the null distribution. Assessment of the distributional assumptions of distributions that have
shape parameters is conceptually different from the corresponding problem for distributions that
do not have shape parameters. In the former case, we test whether the sample comes from
a particular member of a family of distributions while in the latter case, we test for a single
distribution. One possibility for dealing with this problem is to specify the value of the shape
parameter and perform the test in the usual way. If the shape parameter cannot be specified,
then an adaptive approach which uses the sample estimate of the shape parameter must be
employed.

Advantages  of  using  the  Qn  procedure  are  explained  in  [50].  Usually  the  classical  goodness  of 
fit  tests  end  up  with  either  rejecting  or  accepting  the  null  hypothesis.  An  attractive  property  of 
the  Qn  procedure  is  that  it  provides  some  information  about  the  true  distributions  if  the  null 
hypothesis  is  rejected.  Using  this  property  an  algorithm  for  characterizing  and  identifying  the 
distributions can be developed. The next section explains these ideas.

7.4  Distribution  Identification  of  SIRVs 

Following the same procedure described in Section 6.4, where the reference distribution was
Gaussian, an identification chart can be generated for each of the quadratic form PDFs of the SIRVs
listed in Tables 7.1 and 7.2. Recall from Chapter 4 that the PDF of the quadratic form is
invariant to the choice of $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$. Hence, for simplicity, the trajectories for the PDFs of the
quadratic  forms  of  the  SIRVs  listed  in  Tables  7.1  and  7.2  are  obtained  by  generating  the  SIRVs 
having  zero  mean  and  identity  covariance  matrix.  Each  point  on  a  trajectory  is  obtained  by 
averaging  the  results  of  2000  Monte  Carlo  trials  of  size  100.  As  before,  PDFs  which  do  not  have 
shape  parameters  are  represented  by  a  single  point  in  the  U-V  plane  while  those  which  have 
shape  parameters  generate  a  trajectory  in  the  U-V  plane  by  changing  the  shape  parameter. 

An example of the identification chart is given in Fig 7.2 for N = 4 and n = 50 where the
expected values of $Q_n = (U_n, V_n)$ are plotted for various distributions. The Gaussian distribution
was used as the reference distribution for determining the angles of the linked vectors. The SIRVs
listed in Table 7.1 and Table 7.2 are included in the chart and labeled by number. It is noted
that the multivariate Gaussian (1), Laplace (2) and Cauchy (3) distributions are represented by
single points on the chart while the multivariate K-distribution (8), Chi (9), Generalized Rayleigh
(10), Weibull (11) and Rician (12) are represented by trajectories. The Student-t distribution (4,
5, 6, 7) with degrees of freedom 3, 5, 10 and 15, respectively, is also shown in the chart. The
trajectories for each distribution were obtained by joining 10 points resulting from the use of
the distributions with parameter values listed in Table 7.3. Each point in the chart is obtained
by simulating 2000 samples from the corresponding distributions. The methods developed by
Rangaswamy et al. [35, 57] were used to generate the multivariate samples.

Table 7.2: SIRVs obtained from the marginal characteristic function

SIRV              $h_{2N}(p)$
Gaussian          $\exp(-p/2)$
Laplace
Cauchy
K-distribution
Student-t

Figure 7.1: Goodness of Fit Test using the $Q_n$ Procedure. 90, 95 and 99% contours for
the Gaussian distribution. Broken Line = Null distribution Pattern

Table 7.3: Shape Parameters of the SIRVs Used for the Identification Chart

SIRV              Shape parameter values
K-distribution    0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 0.9, 1.1, 1.5, 1.9
Chi               0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.6, 0.8, 0.75, 0.95
Gen. Rayleigh     0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 1.0, 1.5, 2.6
Weibull
Rician            0.15, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.85, 0.9

The identification chart provides an interesting display for identifying and characterizing
the distributions. Also, relationships between the various distributions are clearly seen. For
example,  as  their  parameters  are  varied,  certain  distributions  approach  the  multivariate  Gaussian 
distribution.  Also,  for  appropriately  chosen  parameters,  the  multivariate  Weibull  distribution 
and  the  Generalized  Rayleigh  distribution  coincide.  For  a  given  N-variate  sample  of  size  n, 
the  statistic  Qn  based  on  the  sample  quadratic  forms  can  be  computed  and  plotted  on  the 
identification  chart.  Then  the  nearest  distribution  to  the  sample  point  is  identified  as  the  best 
candidate  for  the  underlying  true  distribution  of  the  data.  An  example  of  such  an  identification 
is  shown  in  Figure  7.2  where  a  well  known  data  set  (i.e.  Iris  Setosa  [58])  is  used  to  obtain  a  value 
for  Qn  and  is  denoted  by  the  point  S.  The  Iris  Setosa  data  consists  of  four  measurements  taken 
from  50  plants.  It  is  seen  from  Figure  7.2  that  the  best  candidate  for  approximating  the  data  is 
the  multivariate  Chi  (9)  distribution. 
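
The nearest-distribution rule described above amounts to a minimum-distance search over the chart. The sketch below assumes plain Euclidean distance in the U-V plane and compares the sample point only with the stored chart points, not with the line segments between them. The chart coordinates are made-up placeholders for illustration, not values from Fig 7.2, and all names are hypothetical.

```python
import math

def nearest_distribution(sample_point, chart):
    """Return (name, point index, distance) of the chart point closest to the sample."""
    best = None
    for name, points in chart.items():
        for idx, (u, v) in enumerate(points):
            d = math.hypot(sample_point[0] - u, sample_point[1] - v)
            if best is None or d < best[2]:
                best = (name, idx, d)
    return best

# Placeholder chart: one single-point PDF and one three-point trajectory
chart = {
    "single-point PDF": [(0.00, 0.33)],
    "trajectory PDF": [(-0.02, 0.31), (-0.04, 0.29), (-0.06, 0.27)],
}
name, idx, dist = nearest_distribution((-0.045, 0.285), chart)
```

The returned index also identifies the neighbouring trajectory points between which a shape parameter could later be interpolated.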

We point out that there are other methods which can be used for the distribution identification
problem. A commonly used technique is the Q-Q plot. To identify the underlying distribution
the sample quantiles are plotted against the expected quantiles of a reference distribution. Then
the resulting shape of the plotted curve is taken as a basis for identifying the corresponding
candidate for the true distributions. However, the identification is made on a subjective basis.
Even then the procedure is not very easy. Another well known approach for identifying the
distributions is to characterize them via their skewness ($\alpha_3$) and kurtosis ($\alpha_4$) coefficients. In this
case, all the distributions are represented by points on the $\alpha_3$-$\alpha_4$ plane and the sample data
point is compared with the theoretical distributions in the same way as in the $Q_n$ procedure.
However, estimates of $\alpha_3$ and $\alpha_4$ are known to be highly sensitive to extreme observations and
therefore, large sample sizes are necessary to perform the identification for a given degree of
accuracy.
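
The sensitivity of the moment coefficients to extreme observations is easy to see numerically. The sketch below assumes the usual moment-ratio definitions, with skewness as the third central moment over the 3/2 power of the second and kurtosis as the fourth central moment over the square of the second. The function name and the small data sets are illustrative only.

```python
def skewness_kurtosis(x):
    """Return the sample skewness and kurtosis coefficients (moment-ratio form)."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((xi - m) ** 2 for xi in x) / n  # second central moment
    m3 = sum((xi - m) ** 3 for xi in x) / n  # third central moment
    m4 = sum((xi - m) ** 4 for xi in x) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2

clean = [-2.0, -1.0, 0.0, 1.0, 2.0]
contaminated = clean + [10.0]  # a single extreme observation
skew_clean, kurt_clean = skewness_kurtosis(clean)
skew_bad, kurt_bad = skewness_kurtosis(contaminated)
```

A single added outlier moves the skewness of this symmetric sample from zero to a value above one, which is why moment-based identification needs large samples.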

7.5  Parameter  Estimation 

It is well known that the maximum likelihood estimate of the covariance matrix of a Gaussian
random vector is the sample covariance matrix. Interestingly enough, it has been shown
in [59] that the maximum likelihood estimate of the covariance matrix $\boldsymbol{\Sigma}$ of an SIRV is the same sample
covariance matrix used in the Gaussian case to within a multiplicative constant. Because $Q_n$
is scale invariant, the identification procedure for SIRVs can proceed without knowledge of the
multiplicative constant.

From eq (7.6), it is clear that the expected value of the quadratic form can be expressed as

$$E(P) = \varphi(N, \gamma) \tag{7.9}$$

where $\gamma$ is the shape parameter of the distribution. For those SIRVs where $\varphi(\cdot,\cdot)$ can be evaluated
in closed form and is invertible, the sample mean of $P$, denoted by $\bar{P}$, can be used to estimate
the shape parameter according to

$$\gamma \approx \varphi^{-1}(N, \bar{P}) \tag{7.10}$$

where $\bar{P} = \frac{1}{n}\sum_{i=1}^{n} P_i$ and $\varphi^{-1}$ denotes the inverse of $\varphi$ with respect to $\gamma$. For example, in the case of
the K-distribution, we have $E(P) = 2\nu N$ where $\nu$
is the shape parameter of the K-distribution. Clearly, the shape parameter can be approximated
by $\nu \approx \bar{P}/(2N)$. Unfortunately, it is not always possible to obtain an invertible closed form expression
for $\varphi(\cdot,\cdot)$. The shape parameter estimation procedure suggested here is not suitable in such a
case. An alternate method for the parameter estimation problem is then needed.
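
For the K-distribution example, the moment-matching estimate is a one-line computation. The sketch below relies only on the relation quoted in the text, namely that the expected value of the quadratic form equals twice the shape parameter times N, and it inherits whatever scale convention that relation assumes. The function name is hypothetical.

```python
def k_shape_from_mean(p_values, n_dim):
    """Moment estimate of the K-distribution shape parameter from quadratic forms."""
    p_bar = sum(p_values) / len(p_values)  # sample mean of the quadratic form
    return p_bar / (2.0 * n_dim)           # invert E(P) = 2 * shape * N
```

For example, quadratic forms averaging 8 with N = 4 give a shape estimate of 1.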

In this Chapter we propose to use the $Q_n$ statistic to obtain an approximate estimator for
the shape parameter. The underlying procedure is explained in [50] and is summarized here.
Let the points $(U_1, V_1)$ and $(U_2, V_2)$ denote expected points corresponding to parameters $\gamma_1$ and
$\gamma_2$, respectively, of a given SIRV. If these points are the nearest points on the curve for the
identified distribution to the sample point $Q_n = (U_n, V_n)$, then by using a linear interpolation,
an approximate estimator of $\gamma$ is given by


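$$\gamma \approx \gamma_1 + \frac{(\gamma_2 - \gamma_1)(U_{eq} - U_1)}{U_2 - U_1} \tag{7.11}$$

where

$$U_{eq} = \frac{M(V_n - V_1) + M^2 U_1 + U_n}{M^2 + 1} \tag{7.12}$$

and $M = (V_2 - V_1)/(U_2 - U_1)$ is the slope of the line joining the two expected points.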


The accuracy of the proposed estimator for $\gamma$ depends on the distance between the sample point
$Q_n$ and the corresponding curve. If necessary, the approximation can be improved by using
non-linear interpolation methods.

7.6  Conclusions 

In this Chapter we have addressed the problem of distribution approximation of radar clutter
under the assumption that the clutter can be characterized as an SIRP. First and foremost, we
have shown that the multivariate distribution identification problem for SIRPs can be reduced
to an equivalent univariate distribution identification problem of a non-negative quadratic form,
resulting in considerable processing simplicity. A new algorithm which provides a graphical
representation for the goodness of fit test and the distribution identification has been used.
This algorithm, while conceptually simple, is extremely efficient when dealing with small sample
sizes. Therefore, it is suitable for use in a variety of practical applications. Finally, based on this
algorithm, a new approach has been proposed for estimating the shape parameter of SIRPs.

Chapter 8

Weak  Signal  Detection  -  Literature 
Review 


8.1  Weak  Signal  Problem 

In  radar  applications  it  is  found  that  the  received  target  signal  is  contaminated  with  clutter 
and  thermal  noise.  The  received  signal  due  to  undesired  reflections  from  land,  sea,  atmosphere 
etc.  is  called  clutter.  The  thermal  noise,  which  is  generated  by  the  receiver  hardware,  is  typically 
modeled  as  a  Gaussian  random  process.  This  kind  of  noise  is  always  present.  Depending  upon 
the  situation,  the  clutter  may  or  may  not  be  modeled  as  a  Gaussian  random  process.  Also, 
the  power  associated  with  the  background  clutter  may  be  orders  of  magnitude  larger  than  the 
receiver  thermal  noise  or  the  desired  signal  power. 

In  modern  radars,  temporal  and  spatial  processing  are  used  to  separate  the  target  from  the 
clutter.  For  example,  the  received  signal  from  a  target  having  a  radial  velocity  with  respect  to 
the  radar  will  experience  a  Doppler  shift.  If  the  target  spectrum  appears  in  the  tail  of  the  clutter 
spectrum,  then  conventional  frequency  domain  techniques  can  be  used  to  extract  the  target  from 
the  clutter.  Similarly,  if  the  spatial  spectrum  of  the  target  does  not  overlap  that  of  the  clutter, 
performance  will  be  limited  by  the  background  noise  rather  than  the  clutter.  In  this  research  use 
is  also  made  of  temporal  and  spatial  processing.  However,  we  are  interested  in  the  case  where  the 
target  temporal  and  spatial  spectra  cannot  be  separated  from  the  clutter.  By  definition,  this  is 
referred  to  as  the  weak  signal  detection  problem.  Given  a  Range-Doppler-Azimuth  cell  in  which 
a  target  is  to  be  detected,  it  is  assumed  that  the  signal  is  larger  than  the  background  noise  but 
much smaller than the clutter. Hence, even after temporal and spatial processing, performance
is limited by the clutter.

Therefore, it becomes very important to identify the clutter plus noise probability density
function. This density function is the Nth order joint density function of the received radar
samples $r_1, r_2, \ldots, r_N$ in the absence of a target signal. The received waveform can be modeled as
a random process. Since we will be sampling this process at N time instants, we need to have
knowledge of the Nth order joint probability density function (PDF) of the N random variables.
In this research effort the performance measures of radar receivers are analyzed, given the Nth
order PDF associated with the random process.

In the hypothesis testing problem, where we have to decide whether the target is present or
absent, two kinds of errors can occur: 1) a false alarm, which occurs when it is decided that the
target is present when it is not, and 2) a miss, which occurs when it is decided that the target is not
present when it is. In many radar problems the chosen criterion is to fix the false alarm probability at a
certain value and then to maximize the probability of detection. In statistical decision theory
the Likelihood Ratio Test (LRT) is optimum for these kinds of problems. The LRT evaluates the
likelihood ratio which is the ratio of the Nth order joint PDF under the alternative hypothesis
$H_1$ (signal present case) to the Nth order joint PDF under the null hypothesis $H_0$ (signal not
present case). This ratio is then compared to a certain threshold to make a decision. Under
the constraint of a fixed false alarm probability, the Neyman-Pearson receiver obtained on the basis of the
likelihood ratio test is the optimum receiver.
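
The Neyman-Pearson procedure can be illustrated with a small Monte Carlo sketch. It assumes the textbook case of a known signal in white Gaussian noise, which is much simpler than the non-Gaussian clutter considered in this report, and it sets the threshold empirically from statistics simulated under $H_0$. The signal, noise level, trial counts and function names are all hypothetical.

```python
import random

def log_likelihood_ratio(r, s, sigma):
    """ln of the likelihood ratio for a known signal s in white Gaussian noise."""
    return sum(si * ri - 0.5 * si * si for si, ri in zip(s, r)) / sigma ** 2

def empirical_threshold(null_statistics, p_fa):
    """Order statistic of the H0 values that about a fraction p_fa of them reach."""
    ordered = sorted(null_statistics)
    k = int((1.0 - p_fa) * len(ordered))
    return ordered[min(max(k, 0), len(ordered) - 1)]

rng = random.Random(1)
s = [0.5] * 16   # hypothetical known signal samples
sigma = 1.0      # hypothetical noise standard deviation
h0 = [log_likelihood_ratio([rng.gauss(0.0, sigma) for _ in s], s, sigma)
      for _ in range(5000)]
eta = empirical_threshold(h0, 0.01)  # threshold for a false alarm rate near 0.01
h1 = [log_likelihood_ratio([si + rng.gauss(0.0, sigma) for si in s], s, sigma)
      for _ in range(5000)]
p_d = sum(t >= eta for t in h1) / len(h1)  # estimated probability of detection
```

Fixing the threshold from the $H_0$ statistics first and only then measuring detections under $H_1$ mirrors the criterion in the text: the false alarm probability is held fixed and the probability of detection is whatever the test achieves.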

The components of the received vector $\mathbf{r}$ can be written mathematically as

$$H_1: \; r_i = s_i + d_i \tag{8.1}$$

$$H_0: \; r_i = d_i, \qquad i = 1, 2, \ldots, N \tag{8.2}$$

where $s_i$ and $d_i$ represent the desired signal return and the additive disturbance, respectively.
Also, let $f_R(\mathbf{r}|H_1)$, $f_R(\mathbf{r}|H_0)$ and $f_D(\mathbf{d})$ denote the Nth order PDFs of $\mathbf{R}$ under $H_1$, $\mathbf{R}$ under $H_0$
and the disturbance, respectively. In general, the disturbance may be composed of clutter plus noise. Since
it is not possible to separate the clutter and noise components of the disturbance when the
disturbance is measured, we focus on the disturbance itself. As the signal becomes very weak
(i.e. as the signal to clutter plus noise ratio (SCNR) approaches zero), the numerator and the
denominator of the LRT tend to become identical. This is due to the fact that

$$f_R(\mathbf{r}|H_1) \approx f_R(\mathbf{r}|H_0) = f_D(\mathbf{r}). \tag{8.3}$$

This will result in the likelihood ratio being approximately equal to unity independent of the
received signal. Thus, if $T_l$ denotes the likelihood ratio,

$$P_D = \int_{\eta}^{\infty} f_{T_l}(t|H_1)\,dt \approx P_F = \int_{\eta}^{\infty} f_{T_l}(t|H_0)\,dt \tag{8.4}$$

where $P_D$ and $P_F$ represent the detection and false alarm probabilities and $\eta$ is the threshold. Therefore, the LRT
performs poorly in the limit as the signal strength tends to zero.

Even though the problem of weak signal detection in radar applications is of great interest,
most of the literature by various researchers has been devoted to strong signals in a clutter
plus noise background. Optimal and/or very good sub-optimal schemes have been proposed to
achieve the desired level of performance. Only a relatively small fraction of the literature is
devoted to the design of practical schemes for the detection of weak signals. In this report we
present a general theory for developing practical detector structures for weak signal problems.
Also, analysis of performance is carried out for a specific case where the background clutter is
assumed to have a multivariate Student-t distribution and the signal to clutter plus noise ratio
(SCNR) is very small. In such problems the concept of the Locally Optimum Detector (LOD)
is used to come up with the decision rule which is also a ratio test. For a deterministic signal, a
statistic is obtained by taking the ratio of the derivative with respect to the signal strength of
the Nth order joint PDF under $H_1$ to the Nth order joint PDF under $H_0$. The limit of this ratio
as the signal strength tends to zero is evaluated to obtain the test statistic for the decision rule.
In the random signal case the test statistic is a ratio, in the limit as the signal strength tends
to zero, of the second derivative with respect to the signal strength of the Nth order joint PDF
under $H_1$ to the Nth order joint PDF under $H_0$. This approach is valid when it is known that
the SCNR is very small but the actual value of the SCNR is unknown. Thus, the LOD turns
out to be a Uniformly Most Powerful (UMP) test for the class of problems where the SCNR is
in the neighborhood of zero. The theory of LODs is explained in detail in the next chapter.

8.1.1 Literature Review

The  concept  of  the  locally  optimum  detector  was  first  established  by  Neyman  and  Pearson  in 
their  paper  ‘Contributions  to  the  Theory  of  Statistical  Hypothesis  Testing’  [60, 61].  Subsequently 
this  was  applied  to  statistical  communication  and  signal  processing  by  several  researchers. 

David  Middleton’s  work  [62]  on  the  LOD  is  based  on  expanding  the  LRT  in  terms  of  a  power 
series  expansion  and  truncating  the  series  to  a  first  order  approximation.  In  the  limit  as  the 
signal  tends  to  zero,  the  canonical  structure  of  the  locally  optimum  detector  is  established  with 
very  weak  restrictions  on  the  statistical  properties  of  signal  and  noise.  The  analysis  applies 
equally  well  to  non-Gaussian  as  well  as  Gaussian,  non-stationary  as  well  as  stationary  processes, 
for  stochastic  as  well  as  deterministic  signals,  continuous  as  well  as  discrete  time  signals  and 
for  combinations  of  signal  and  noise  that  need  not  be  additive.  In  fact,  the  general  character 
of  the  results  is  independent  of  the  particular  nature  of  the  signal  and  noise,  although  specific 
noise  distributions  determine  the  specific  detector  structures.  Middleton  shows  that  the  locally 
optimum  detector  is  a  threshold  detector  with  very  strong  optimality  features  in  the  limit  of  an 
infinitely  large  number  of  samples.  However,  in  our  research,  we  are  interested  in  applications 
where  the  number  of  samples  may  not  be  too  large. 

For a variety of detection problems, Jack Capon [63] concludes that implementation of the LOD
is either less complicated than, or no more complicated than, the Neyman-Pearson detector.
Other researchers in this area such as John Thomas [64], Saleem Kassam [48], Conte and Longo [65],
and Shishkov and Penev [66] have all obtained the performance of the LOD under the asymptotic
condition of an infinitely large number of samples. These researchers have modeled the noise
samples as independent, identically distributed random variables. This enables them to obtain a
closed form expression for the Nth order PDF of multivariate non-Gaussian noise. Applying the
LOD test, they arrive at the decision statistic. Using the central limit theorem, the test statistic
is shown to approach a Gaussian random variable in the limit of very large sample size. Then the
performance measures are evaluated. Shishkov and Penev [66] have considered correlated
interference, but have restricted themselves to multivariate Gaussian interference. Modestino and
Ningo [47] were amongst the earliest researchers to consider weak signal detection arising from
bandpass processes. They modeled the received signal as statistically independent complex samples
and then obtained the joint density function of the inphase and quadrature components. Under
the assumption that the clutter density function is circularly symmetric, they transform the joint
density function to an equivalent one involving the envelope and phase. Martinez, Swaszek and
Thomas [54] have considered the case where the noise has a multivariate Laplace distribution,
where any non-negative definite matrix can be used to model the correlation between the random
variables. However, they do not analyze the receiver performance for small sample sizes, which
is the case of practical interest.

8.2  Non-Gaussian  Correlated  Data 

Previously, general analytic expressions for the various applicable Nth order joint non-Gaussian
PDFs which allow for correlation between the variables were unavailable. As a result, researchers
in the past assumed independence between the samples. By assuming independence between the
samples, they were able to obtain the Nth order PDF as a product of the marginals. If we carry
out the locally optimum test using the Nth order density function based upon independence and
evaluate its performance, it is found that an unreasonably large number of samples is needed
for acceptable performance. This arises because independent samples imply a white spectrum.
Consequently, space-time processing cannot be used to filter the target from the clutter. Based on
the concept of Spherically Invariant Random Processes (SIRP), analytical expressions for some
Nth order joint non-Gaussian PDFs which allow for correlation between the variables are now
available. The SIRP was explained in great detail in Chapters 3-7. Since theoretical evaluation
of receiver performance is very difficult for non-Gaussian PDFs, it is done through computer
simulation. The computer simulation procedure for receiver performance evaluation is explained
in Chapter 11. This performance is compared with that of the Gaussian receiver to see the gain
obtained due to the added complexity of the locally optimum detector.
Chapter 9

The  Locally  Optimum  Detector 


The usual criterion in radar problems is to maximize the probability of detection under a fixed
false alarm probability constraint. The resulting receiver is called the Neyman-Pearson receiver.
The receiver computes the Likelihood Ratio Test (LRT) statistic and compares it against a threshold
whose value is designed to give the desired false alarm probability. In particular, consider the
received vector $\mathbf{R}^T = [R_1, R_2, \ldots, R_N]$. Introduce the two hypotheses $H_0$ and
$H_1$ as described below:

$$H_0: \quad r_i = c_i + n_i \tag{9.1}$$

$$H_1: \quad r_i = \theta s_i + c_i + n_i, \qquad i = 1, 2, \ldots, N. \tag{9.2}$$

Thus, $H_0$ pertains to the hypothesis that the received signal consists solely of clutter plus noise
while the target signal is assumed to be present under the hypothesis $H_1$. Let the joint probability
density function of $R_1, R_2, \ldots, R_N$ under hypothesis $H_k$ ($k = 0, 1$) be denoted by
$f_{\mathbf{R}}(\mathbf{r}|H_k)$. The Neyman-Pearson receiver performs the LRT

$$T_s(\mathbf{r}) = \frac{f_{\mathbf{R}}(\mathbf{r}|H_1)}{f_{\mathbf{R}}(\mathbf{r}|H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \eta \tag{9.3}$$

where $\eta$ is specified to satisfy the false alarm constraint

$$P_F = \int_{\eta}^{\infty} f_{T_s}(t|H_0)\,dt \tag{9.4}$$

and $f_{T_s}(t|H_k)$ is the conditional probability density function of the test statistic $T_s$ given
hypothesis $H_k$.

However, when the signal strength is very small relative to the clutter plus noise, the joint
density function of the received random variables under $H_1$ approaches that under $H_0$. Then the
numerator and the denominator of the LRT become approximately equal, leading to numerical
difficulties in discriminating between the two hypotheses. The Neyman-Pearson test is of course
optimum. However, the form of the LRT can be rearranged to yield a test statistic which is
more sensitive to perturbations in the received data. This gives rise to the concept of the Locally
Optimum Detector (LOD). In this chapter the concept of the LOD is developed in detail using two
approaches. The first approach is based on a power series expansion of the LRT and the second
approach derives the LOD by an optimization using the principle of Lagrangian multipliers. It
is shown that both approaches yield identical detector structures, though starting from different
theoretical points of view. As the signal strength becomes weaker, the LOD becomes optimum
even though its performance may not be as good as desired for a fixed sample size.

9.1  The  Series  Approach 

9.1.1  The  Known  Signal  Case 

Let the additive clutter component $\mathbf{C} = [C_1, C_2, \ldots, C_N]^T$ be stationary and
independent of the stationary white Gaussian background noise
$\mathbf{N} = [N_1, N_2, \ldots, N_N]^T$. The noise variance $\sigma_n^2$ is assumed to be several
orders of magnitude below the clutter variance $\sigma_c^2$, which is taken to be unity without loss
of generality. The signal is assumed to be of the form $\theta\mathbf{s}$, where $\mathbf{s}$ is known.
The components of $\mathbf{s}$ are chosen to have $|s_i|^2 = 1$ so that the positive parameter
$\theta$ is a measure of the signal to clutter ratio (SCR) defined by

$$\mathrm{SCR} = \frac{\theta^2}{\sigma_c^2}. \tag{9.5}$$

Because the clutter and noise are statistically independent with the noise assumed to have zero
mean, the covariance matrix of the disturbance vector $\mathbf{D} = \mathbf{C} + \mathbf{N}$,
denoted by $M_D$, is equal to the covariance matrix of the clutter $M_C$ plus the covariance matrix
of the noise $M_N$. Since the noise is white and stationary, the covariance matrix of the noise is of
the form $M_N = \sigma_n^2 I$, where $I$ is the identity matrix. When the clutter is highly
correlated, the covariance matrix $M_C$ tends to be ill-conditioned. However, $M_D$ will not be
ill-conditioned because, by adding the small value $\sigma_n^2$ to the diagonal elements of $M_C$,
the smallest eigenvalue of $M_D$ is guaranteed to be no smaller than $\sigma_n^2$. Also, addition
of $M_N$ to $M_C$ ensures that the disturbance spectrum will limit performance even in those
frequency intervals where the clutter spectrum is negligible.
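The conditioning argument above can be checked numerically. The following is a minimal sketch, assuming a hypothetical 2x2 clutter covariance with unit variance and correlation coefficient `rho`; the values of `rho` and `sigma_n2` are illustrative choices, not taken from the report. For a symmetric 2x2 matrix the eigenvalues are available in closed form, so no linear algebra library is needed.

```python
import math

def eig_sym_2x2(a, b, d):
    """Eigenvalues (smallest first) of the symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius

def condition_number(a, b, d):
    """Ratio of largest to smallest eigenvalue of [[a, b], [b, d]]."""
    smallest, largest = eig_sym_2x2(a, b, d)
    return largest / smallest

rho = 0.999999      # assumed clutter correlation (illustrative only)
sigma_n2 = 1e-4     # assumed noise variance, far below the unit clutter variance

# Clutter alone: M_C = [[1, rho], [rho, 1]] is nearly singular.
cond_clutter = condition_number(1.0, rho, 1.0)

# Disturbance: M_D = M_C + sigma_n2 * I shifts every eigenvalue up by sigma_n2.
min_eig_disturbance, _ = eig_sym_2x2(1.0 + sigma_n2, rho, 1.0 + sigma_n2)
cond_disturbance = condition_number(1.0 + sigma_n2, rho, 1.0 + sigma_n2)

print("condition number of M_C:", cond_clutter)
print("condition number of M_D:", cond_disturbance)
print("smallest eigenvalue of M_D:", min_eig_disturbance)
```

In this illustration the smallest eigenvalue of the disturbance covariance stays above `sigma_n2`, and the condition number drops by roughly two orders of magnitude relative to the clutter covariance alone.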
With this approximation the LRT takes the form

$$T_s(\mathbf{r}) = \frac{f_{\mathbf{R}}(\mathbf{r}|H_1)}{f_{\mathbf{R}}(\mathbf{r}|H_0)} = \frac{f_D(\mathbf{r} - \theta\mathbf{s})}{f_D(\mathbf{r})} \underset{H_0}{\overset{H_1}{\gtrless}} \eta. \tag{9.6}$$

As mentioned previously, when $\theta \ll 1$, the signal $\theta\mathbf{s}$ represents a small
perturbation in the received vector under hypothesis $H_1$. Hence,
$f_{\mathbf{R}}(\mathbf{r}|H_1)$ approximately equals $f_{\mathbf{R}}(\mathbf{r}|H_0)$. As a
result, $T_s$ is relatively insensitive to $\theta\mathbf{s}$. One approach to deriving a weak signal
detector is to expand the numerator of the LRT in a Taylor series.

For this purpose, let $\mathbf{y} = \mathbf{r} - \theta\mathbf{s}$. Then

$$f_{\mathbf{R}}(\mathbf{r}|H_1) = f_D(\mathbf{y}). \tag{9.7}$$


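Expanding $f_D(\mathbf{y})$ in a Taylor series about the received vector $\mathbf{r}$, we obtain

$$
\begin{aligned}
f_D(\mathbf{y}) = {} & f_D(\mathbf{r}) + \sum_{k_1=1}^{N} (y_{k_1} - r_{k_1}) \left.\frac{\partial f_D(\mathbf{y})}{\partial y_{k_1}}\right|_{\mathbf{y}=\mathbf{r}} \\
& + \frac{1}{2!} \sum_{k_1=1}^{N} \sum_{k_2=1}^{N} (y_{k_1} - r_{k_1})(y_{k_2} - r_{k_2}) \left.\frac{\partial^2 f_D(\mathbf{y})}{\partial y_{k_1} \partial y_{k_2}}\right|_{\mathbf{y}=\mathbf{r}} + \cdots \\
& + \frac{1}{n!} \sum_{k_1=1}^{N} \sum_{k_2=1}^{N} \cdots \sum_{k_n=1}^{N} (y_{k_1} - r_{k_1})(y_{k_2} - r_{k_2}) \cdots (y_{k_n} - r_{k_n}) \left.\frac{\partial^n f_D(\mathbf{y})}{\partial y_{k_1} \partial y_{k_2} \cdots \partial y_{k_n}}\right|_{\mathbf{y}=\mathbf{r}} + \cdots
\end{aligned}
\tag{9.8}
$$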


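This can be expressed in vector form by introducing the operator

$$(\mathbf{y} - \mathbf{r})^T \nabla_{\mathbf{y}} = \sum_{k=1}^{N} (y_k - r_k) \frac{\partial}{\partial y_k} \tag{9.9}$$

where the subscript $\mathbf{y}$ on $\nabla$ indicates partial differentiation with respect to the
components of $\mathbf{y}$. The expansion of $f_D(\mathbf{y})$ about the point
$\mathbf{y} = \mathbf{r}$ then becomes

$$
\begin{aligned}
f_D(\mathbf{y}) = {} & f_D(\mathbf{r}) + [(\mathbf{y} - \mathbf{r})^T \nabla_{\mathbf{y}}] f_D(\mathbf{y})\big|_{\mathbf{y}=\mathbf{r}} + \frac{1}{2!} [(\mathbf{y} - \mathbf{r})^T \nabla_{\mathbf{y}}]^2 f_D(\mathbf{y})\big|_{\mathbf{y}=\mathbf{r}} \\
& + \cdots + \frac{1}{n!} [(\mathbf{y} - \mathbf{r})^T \nabla_{\mathbf{y}}]^n f_D(\mathbf{y})\big|_{\mathbf{y}=\mathbf{r}} + \cdots \\
= {} & f_D(\mathbf{r}) + \sum_{n=1}^{\infty} \frac{1}{n!} [(\mathbf{y} - \mathbf{r})^T \nabla_{\mathbf{y}}]^n f_D(\mathbf{y})\big|_{\mathbf{y}=\mathbf{r}}.
\end{aligned}
\tag{9.10}
$$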

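Recall that $\mathbf{y} = \mathbf{r} - \theta\mathbf{s}$, where $\theta$ and $\mathbf{s}$ are
constants. Note that $\mathbf{y} - \mathbf{r} = -\theta\mathbf{s}$ and
$\partial/\partial y_k = \partial/\partial r_k$. Then

$$(\mathbf{y} - \mathbf{r})^T \nabla_{\mathbf{y}} = \sum_{k=1}^{N} (-\theta s_k) \frac{\partial}{\partial r_k} = -\theta\, \mathbf{s}^T \nabla_{\mathbf{r}} \tag{9.11}$$

where the subscript $\mathbf{r}$ on $\nabla$ indicates partial differentiation with respect to the
components of $\mathbf{r}$. It follows that the expansion may be written as

$$f_D(\mathbf{r} - \theta\mathbf{s}) = f_D(\mathbf{r}) + \sum_{n=1}^{\infty} \frac{(-\theta)^n}{n!} [\mathbf{s}^T \nabla_{\mathbf{r}}]^n f_D(\mathbf{r}). \tag{9.12}$$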

In  order  for  the  above  expansion  to  be  meaningful,  it  is  necessary  that  all  the  derivatives  in  the 
above  expansion  exist. 

Thus, using the above expansion of $f_D(\mathbf{r} - \theta\mathbf{s})$, the Taylor series
expansion of the likelihood ratio about the received vector $\mathbf{r}$ in equation (9.6) can be
written as

$$T_s(\mathbf{r}) = 1 + \sum_{n=1}^{\infty} \frac{(-\theta)^n}{n!} \frac{[\mathbf{s}^T \nabla_{\mathbf{r}}]^n f_D(\mathbf{r})}{f_D(\mathbf{r})}. \tag{9.13}$$

The first term, being a constant, can be combined with the threshold without loss of optimality.
The LOD is defined to be the term corresponding to $n = 1$ of the infinite summation. For
$\theta \ll 1$, it is assumed that the remaining terms in the summation are negligible. On the other
hand, because $\mathbf{r}$ is random and the partial derivatives of the PDF may be large, the
remaining terms may not be negligible. However, it is assumed that this occurs with small
probability. The resulting detector structure can be expressed as

$$T_{LOD}(\mathbf{r}) = -\frac{(\mathbf{s}^T \nabla_{\mathbf{r}}) f_D(\mathbf{r})}{f_D(\mathbf{r})} \underset{H_0}{\overset{H_1}{\gtrless}} \eta \tag{9.14}$$

where $\eta$ is chosen so as to achieve the desired false alarm probability.
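As a sanity check on the structure in equation (9.14), it may help to work it out for a zero-mean Gaussian disturbance with covariance matrix $M_D$. This Gaussian case is an illustrative assumption added here, not part of the derivation above; it shows the LOD collapsing to the familiar whitened matched filter.

```latex
% Assumed (illustrative) Gaussian disturbance density:
f_D(\mathbf{r}) = (2\pi)^{-N/2} |M_D|^{-1/2}
                  \exp\!\left(-\tfrac{1}{2}\,\mathbf{r}^T M_D^{-1} \mathbf{r}\right)

% Its gradient is proportional to the density itself:
\nabla_{\mathbf{r}} f_D(\mathbf{r}) = -M_D^{-1}\mathbf{r}\; f_D(\mathbf{r})

% Substituting into the n = 1 term of the series:
T_{LOD}(\mathbf{r})
  = -\frac{(\mathbf{s}^T \nabla_{\mathbf{r}}) f_D(\mathbf{r})}{f_D(\mathbf{r})}
  = \mathbf{s}^T M_D^{-1} \mathbf{r}
```

Under this assumption the statistic is linear in the data, so the nonlinearity of the LOD only appears when the disturbance is non-Gaussian.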
9.1.2 The Random Signal Case

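When the signal is random, $f_{\mathbf{R}}(\mathbf{r}|H_1)$ is obtained by integrating the joint
density function $f_{\mathbf{R},\mathbf{S}}(\mathbf{r}, \mathbf{s}|H_1)$ over all possible values of
$\mathbf{s}$. Hence,

$$f_{\mathbf{R}}(\mathbf{r}|H_1) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_D(\mathbf{r} - \theta\mathbf{s})\, f_{\mathbf{S}}(\mathbf{s})\, d\mathbf{s} = E_s[f_D(\mathbf{r} - \theta\mathbf{s})] \tag{9.15}$$

where $E_s$ denotes the expectation operation carried out with respect to the random vector
$\mathbf{S}$. Because the denominator of $T_s$ in equation (9.6) is independent of $\mathbf{s}$, the
Taylor series expansion of the likelihood ratio can now be written as

$$T_s(\mathbf{r}) = 1 + \sum_{n=1}^{\infty} \frac{(-\theta)^n}{n!} \frac{E_s\{[\mathbf{s}^T \nabla_{\mathbf{r}}]^n\} f_D(\mathbf{r})}{f_D(\mathbf{r})}. \tag{9.16}$$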


Once again, as in the known signal case, the unity term appearing in the test statistic can be
absorbed into the threshold. If we make the assumption that the expected value of the signal vector
is $\mathbf{0}$, then the $n = 1$ term in the infinite series of equation (9.16) goes to zero. Thus,
for the random signal case, where the signal vector has zero mean, the LOD is defined to be the
second term ($n = 2$) in the infinite series. As in the deterministic signal case, $\theta$ is assumed
to be small enough that the remaining terms of the series are negligible with high probability.
Consequently, the LOD for the random signal case is given by

$$T_{LOD}(\mathbf{r}) = T_{s2}(\mathbf{r}) \tag{9.17}$$

where $T_{s2}$ represents the second order term in the Taylor series expansion of $T_s$. The above
equation can be rewritten as

$$T_{s2}(\mathbf{r}) = \frac{\theta^2}{2}\, \frac{E_s[(\mathbf{s}^T \nabla_{\mathbf{r}})^2]\, f_D(\mathbf{r})}{f_D(\mathbf{r})} \underset{H_0}{\overset{H_1}{\gtrless}} \eta' \tag{9.18}$$

where, as before, $\eta'$ is chosen to achieve the specified false alarm probability. Lumping the
constant $\theta^2/2$ with the threshold and recognizing that

$$E_s[(\mathbf{s}^T \nabla_{\mathbf{r}})^2] = E_s[\nabla_{\mathbf{r}}^T \mathbf{s}\, \mathbf{s}^T \nabla_{\mathbf{r}}] = \nabla_{\mathbf{r}}^T P\, \nabla_{\mathbf{r}}, \tag{9.19}$$

where $P$ is the covariance matrix of the signal vector, the detector structure for the locally
optimal test becomes

$$T_{LOD}(\mathbf{r}) = \frac{(\nabla_{\mathbf{r}}^T P\, \nabla_{\mathbf{r}}) f_D(\mathbf{r})}{f_D(\mathbf{r})} \underset{H_0}{\overset{H_1}{\gtrless}} \eta. \tag{9.20}$$

9.2  The  Lagrangian  approach 

Consider again the hypothesis testing problem defined in equation (9.2). Let us define a
nonrandomized decision rule $\phi(\mathbf{r})$ such that

$$\phi(\mathbf{r}) = \begin{cases} 1, & H_1 \text{ true (target present)} \\ 0, & H_0 \text{ true (target absent).} \end{cases} \tag{9.21}$$

This amounts to partitioning the decision space into two regions, $S_1$ and $S_0$. A target is declared
if the vector $\mathbf{r}$ falls in the region $S_1$. If it falls in the region $S_0$, then the decision
is made that the target is absent. The probability of detection equals the probability that the
nonrandomized decision rule equals unity, given that hypothesis $H_1$ is indeed true. This
probability will, in general, be a function of $\theta$, the signal to clutter ratio. Denoting
$\beta(\theta)$ as the probability of detection we have

$$\beta(\theta) = \int_{-\infty}^{\infty} \phi(\mathbf{r})\, f_{\mathbf{R}}(\mathbf{r}|H_1)\, d\mathbf{r}. \tag{9.22}$$

$\beta(\theta)$ is defined to be the power function of the test. The false alarm probability is given by

$$\int_{-\infty}^{\infty} \phi(\mathbf{r})\, f_{\mathbf{R}}(\mathbf{r}|H_0)\, d\mathbf{r} = \alpha. \tag{9.23}$$

The optimization problem to be discussed in the next section imposes the constraint that the
false alarm probability be equal to $\alpha$. $\alpha$ is also defined to be the significance level of
the test.

9.2.1  The  Known  Signal  Case 

As discussed earlier, in the limit as the signal strength tends to zero, the probability of detection
becomes approximately equal to the probability of false alarm. Therefore, instead of maximizing
the probability of detection, one approach is to maximize the slope of the power function
$\beta(\theta)$ at the point $\theta = 0$. The function to be maximized and the constraint are given
in the following two equations. Maximize

$$\left.\frac{\partial \beta(\theta)}{\partial \theta}\right|_{\theta=0} = \left.\frac{\partial}{\partial \theta} \int_{-\infty}^{\infty} \phi(\mathbf{r})\, f_{\mathbf{R}}(\mathbf{r}|H_1)\, d\mathbf{r}\,\right|_{\theta=0} \tag{9.24}$$

subject to the constraint

$$\int_{-\infty}^{\infty} \phi(\mathbf{r})\, f_{\mathbf{R}}(\mathbf{r}|H_0)\, d\mathbf{r} = \alpha. \tag{9.25}$$

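We also require that the test be uniformly most powerful (UMP) in the sense that
$\phi(\mathbf{r})$ be independent of $\theta$ for small neighborhoods in the vicinity of
$\theta = 0$. Notice that there is a derivative with respect to $\theta$ outside the integral in
equation (9.24). If the function $f_{\mathbf{R}}(\mathbf{r}|H_1)$ is a well behaved function such
that its derivative exists at all points, the derivative can be moved inside the integral resulting in

$$\frac{\partial}{\partial \theta} \int_{-\infty}^{\infty} \phi(\mathbf{r})\, f_{\mathbf{R}}(\mathbf{r}|H_1)\, d\mathbf{r} = \int_{-\infty}^{\infty} \frac{\partial \phi(\mathbf{r})}{\partial \theta}\, f_{\mathbf{R}}(\mathbf{r}|H_1)\, d\mathbf{r} + \int_{-\infty}^{\infty} \phi(\mathbf{r})\, \frac{\partial f_{\mathbf{R}}(\mathbf{r}|H_1)}{\partial \theta}\, d\mathbf{r}. \tag{9.26}$$

Because of the UMP requirement, $\partial \phi(\mathbf{r})/\partial \theta = 0$ and the first integral
in equation (9.26) integrates to zero. It follows that

$$\frac{\partial}{\partial \theta} \int_{-\infty}^{\infty} \phi(\mathbf{r})\, f_{\mathbf{R}}(\mathbf{r}|H_1)\, d\mathbf{r} = \int_{-\infty}^{\infty} \phi(\mathbf{r})\, \frac{\partial f_{\mathbf{R}}(\mathbf{r}|H_1)}{\partial \theta}\, d\mathbf{r}. \tag{9.27}$$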


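Given the function $\partial \beta(\theta)/\partial \theta\,|_{\theta=0}$ to be maximized along with
the false alarm probability constraint, the functional form of the maximization problem using the
Lagrange multiplier approach is

$$\max_{\phi} \left[ \left.\int_{-\infty}^{\infty} \phi(\mathbf{r})\, \frac{\partial f_{\mathbf{R}}(\mathbf{r}|H_1)}{\partial \theta}\, d\mathbf{r}\,\right|_{\theta=0} + \eta \left( \alpha - \int_{-\infty}^{\infty} \phi(\mathbf{r})\, f_{\mathbf{R}}(\mathbf{r}|H_0)\, d\mathbf{r} \right) \right] \tag{9.28}$$

where $\eta$ is the Lagrange multiplier. Expression (9.28) can be rewritten as

$$\max_{\phi} \left[ \int_{-\infty}^{\infty} \phi(\mathbf{r}) \left[ \frac{\partial f_{\mathbf{R}}(\mathbf{r}|H_1)}{\partial \theta} - \eta\, f_{\mathbf{R}}(\mathbf{r}|H_0) \right] d\mathbf{r} \right]_{\theta=0} + \eta\alpha. \tag{9.29}$$

To maximize the above integral, the decision regions should be chosen such that the integrand
is always positive. In other words, the decision regions are chosen such that

$$\left.\frac{\partial f_{\mathbf{R}}(\mathbf{r}|H_1)}{\partial \theta}\right|_{\theta=0} \underset{H_0}{\overset{H_1}{\gtrless}} \eta\, f_{\mathbf{R}}(\mathbf{r}|H_0). \tag{9.30}$$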
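As was pointed out in the previous section, $f_{\mathbf{R}}(\mathbf{r}|H_1)$ is identical to
$f_D(\mathbf{r} - \theta\mathbf{s})$. Therefore, the decision rule becomes

$$\left.\frac{\partial f_D(\mathbf{r} - \theta\mathbf{s})}{\partial \theta}\right|_{\theta=0} \underset{H_0}{\overset{H_1}{\gtrless}} \eta\, f_D(\mathbf{r}). \tag{9.31}$$

The locally optimum detector is defined to be that detector which implements the ratio test

$$T_{LOD}(\mathbf{r}) = \frac{\left.\dfrac{\partial f_D(\mathbf{r} - \theta\mathbf{s})}{\partial \theta}\right|_{\theta=0}}{f_D(\mathbf{r})} \underset{H_0}{\overset{H_1}{\gtrless}} \eta. \tag{9.32}$$

The Lagrange multiplier $\eta$ is chosen to satisfy the false alarm constraint. Note that

$$f_D(\mathbf{r} - \theta\mathbf{s}) = f_D(r_1 - \theta s_1,\, r_2 - \theta s_2,\, \ldots,\, r_N - \theta s_N). \tag{9.33}$$

As a result,

$$
\begin{aligned}
\frac{\partial f_D(\mathbf{r} - \theta\mathbf{s})}{\partial \theta} = {} & \frac{\partial f_D(\mathbf{r} - \theta\mathbf{s})}{\partial (r_1 - \theta s_1)} \frac{\partial (r_1 - \theta s_1)}{\partial \theta} + \frac{\partial f_D(\mathbf{r} - \theta\mathbf{s})}{\partial (r_2 - \theta s_2)} \frac{\partial (r_2 - \theta s_2)}{\partial \theta} \\
& + \cdots + \frac{\partial f_D(\mathbf{r} - \theta\mathbf{s})}{\partial (r_N - \theta s_N)} \frac{\partial (r_N - \theta s_N)}{\partial \theta} \\
= {} & \sum_{k=1}^{N} \frac{\partial f_D(\mathbf{r} - \theta\mathbf{s})}{\partial (r_k - \theta s_k)} (-s_k).
\end{aligned}
\tag{9.34}
$$

Consequently,

$$\left.\frac{\partial f_D(\mathbf{r} - \theta\mathbf{s})}{\partial \theta}\right|_{\theta=0} = -\sum_{k=1}^{N} s_k \frac{\partial f_D(\mathbf{r})}{\partial r_k} = -(\mathbf{s}^T \nabla_{\mathbf{r}}) f_D(\mathbf{r}). \tag{9.35}$$

Thus, the locally optimum detector can also be written as

$$T_{LOD}(\mathbf{r}) = -\frac{(\mathbf{s}^T \nabla_{\mathbf{r}}) f_D(\mathbf{r})}{f_D(\mathbf{r})} \underset{H_0}{\overset{H_1}{\gtrless}} \eta. \tag{9.36}$$

It can be seen that this detector is identical to the one in equation (9.14) obtained through the
series approach.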

9.2.2  The  Random  Signal  Case 

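Consider a random signal $\mathbf{S}$ and let its joint PDF be denoted by
$f_{\mathbf{S}}(\mathbf{s})$. Also, without loss of generality, we can make the assumption that the
signal vector has zero mean and that each component of the vector has unit variance. Given the
signal vector $\mathbf{s}$, the joint density function of the received vector under hypothesis $H_1$ is

$$f_{\mathbf{R}}(\mathbf{r}|\mathbf{s}, H_1) = f_D(\mathbf{r} - \theta\mathbf{s}). \tag{9.37}$$

The power function for the locally optimum test was given in the previous section in equation
(9.22). However, in the random signal case the unconditional density function
$f_{\mathbf{R}}(\mathbf{r}|H_1)$ is obtained by integrating out the random vector $\mathbf{s}$ from
the joint PDF
$f_{\mathbf{R},\mathbf{S}}(\mathbf{r}, \mathbf{s}|H_1) = f_{\mathbf{R}}(\mathbf{r}|\mathbf{s}, H_1)\, f_{\mathbf{S}}(\mathbf{s})$.
Use of equation (9.37) results in

$$\beta(\theta) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \phi(\mathbf{r})\, f_D(\mathbf{r} - \theta\mathbf{s})\, f_{\mathbf{S}}(\mathbf{s})\, d\mathbf{r}\, d\mathbf{s}. \tag{9.38}$$

The false alarm constraint is once again given by

$$\int_{-\infty}^{\infty} \phi(\mathbf{r})\, f_{\mathbf{R}}(\mathbf{r}|H_0)\, d\mathbf{r} = \alpha. \tag{9.39}$$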


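As before, we wish to maximize $\partial \beta(\theta)/\partial \theta\,|_{\theta=0}$. If the function
$f_D(\mathbf{r} - \theta\mathbf{s})$ is a well behaved function such that its derivative exists at
all points, then

$$\frac{\partial \beta(\theta)}{\partial \theta} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \phi(\mathbf{r})\, \frac{\partial f_D(\mathbf{r} - \theta\mathbf{s})}{\partial \theta}\, f_{\mathbf{S}}(\mathbf{s})\, d\mathbf{r}\, d\mathbf{s}. \tag{9.40}$$

It follows from equations (9.35) and (9.36) that

$$\left.\frac{\partial \beta(\theta)}{\partial \theta}\right|_{\theta=0} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \phi(\mathbf{r}) \left[ -\sum_{k=1}^{N} \frac{\partial f_D(\mathbf{r})}{\partial r_k}\, s_k \right] f_{\mathbf{S}}(\mathbf{s})\, d\mathbf{r}\, d\mathbf{s}. \tag{9.41}$$

Because of the zero mean assumption

$$\int_{-\infty}^{\infty} s_k\, f_{\mathbf{S}}(\mathbf{s})\, d\mathbf{s} = 0. \tag{9.42}$$

We conclude that

$$\left.\frac{\partial \beta(\theta)}{\partial \theta}\right|_{\theta=0} = 0 \tag{9.43}$$

independent of the choice of $\phi(\mathbf{r})$. Therefore, to maximize the ability of the power
function to increase in the vicinity of the origin, we maximize
$\partial^2 \beta(\theta)/\partial \theta^2\,|_{\theta=0}$. As before, assuming that the order of
integration and differentiation can be interchanged,

$$\frac{\partial^2 \beta(\theta)}{\partial \theta^2} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \phi(\mathbf{r})\, \frac{\partial^2 f_D(\mathbf{r} - \theta\mathbf{s})}{\partial \theta^2}\, f_{\mathbf{S}}(\mathbf{s})\, d\mathbf{r}\, d\mathbf{s}. \tag{9.44}$$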


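However, from equation (9.34),

$$
\begin{aligned}
\frac{\partial^2 f_D(\mathbf{r} - \theta\mathbf{s})}{\partial \theta^2} & = \frac{\partial}{\partial \theta} \left[ \sum_{k=1}^{N} \frac{\partial f_D(\mathbf{r} - \theta\mathbf{s})}{\partial (r_k - \theta s_k)} (-s_k) \right] \\
& = \sum_{j=1}^{N} \sum_{k=1}^{N} \frac{\partial^2 f_D(\mathbf{r} - \theta\mathbf{s})}{\partial (r_j - \theta s_j)\, \partial (r_k - \theta s_k)} \frac{\partial (r_j - \theta s_j)}{\partial \theta} (-s_k) \\
& = \sum_{j=1}^{N} \sum_{k=1}^{N} \frac{\partial^2 f_D(\mathbf{r} - \theta\mathbf{s})}{\partial (r_j - \theta s_j)\, \partial (r_k - \theta s_k)}\, s_j s_k.
\end{aligned}
\tag{9.45}
$$

Hence,

$$\left.\frac{\partial^2 f_D(\mathbf{r} - \theta\mathbf{s})}{\partial \theta^2}\right|_{\theta=0} = \sum_{j=1}^{N} \sum_{k=1}^{N} \frac{\partial^2 f_D(\mathbf{r})}{\partial r_j\, \partial r_k}\, s_j s_k = (\nabla_{\mathbf{r}}^T \mathbf{s}\, \mathbf{s}^T \nabla_{\mathbf{r}}) f_D(\mathbf{r}). \tag{9.46}$$

Then the second derivative of the power function at the origin takes the form

$$\left.\frac{\partial^2 \beta(\theta)}{\partial \theta^2}\right|_{\theta=0} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \phi(\mathbf{r}) (\nabla_{\mathbf{r}}^T \mathbf{s}\, \mathbf{s}^T \nabla_{\mathbf{r}}) f_D(\mathbf{r})\, f_{\mathbf{S}}(\mathbf{s})\, d\mathbf{r}\, d\mathbf{s} = \int_{-\infty}^{\infty} \phi(\mathbf{r})\, E_s(\nabla_{\mathbf{r}}^T \mathbf{s}\, \mathbf{s}^T \nabla_{\mathbf{r}}) f_D(\mathbf{r})\, d\mathbf{r}. \tag{9.47}$$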

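Using the approach of Lagrange multipliers to maximize the function in equation (9.47) along
with the constraint (9.39), the optimization problem can be written as

$$\max_{\phi} \left[ \int_{-\infty}^{\infty} \phi(\mathbf{r})\, E_s(\nabla_{\mathbf{r}}^T \mathbf{s}\, \mathbf{s}^T \nabla_{\mathbf{r}}) f_D(\mathbf{r})\, d\mathbf{r} + \eta \left[ \alpha - \int_{-\infty}^{\infty} \phi(\mathbf{r})\, f_D(\mathbf{r})\, d\mathbf{r} \right] \right]. \tag{9.48}$$

The above expression can be rewritten as

$$\max_{\phi} \left[ \int_{-\infty}^{\infty} \phi(\mathbf{r}) \left[ E_s(\nabla_{\mathbf{r}}^T \mathbf{s}\, \mathbf{s}^T \nabla_{\mathbf{r}}) f_D(\mathbf{r}) - \eta\, f_D(\mathbf{r}) \right] d\mathbf{r} \right] + \eta\alpha. \tag{9.49}$$

To maximize the integral the decision regions have to be chosen such that the integrand is always
nonnegative. The resulting decision regions yield the inequalities

$$E_s(\nabla_{\mathbf{r}}^T \mathbf{s}\, \mathbf{s}^T \nabla_{\mathbf{r}}) f_D(\mathbf{r}) \underset{H_0}{\overset{H_1}{\gtrless}} \eta\, f_D(\mathbf{r}). \tag{9.50}$$

If the covariance matrix of the signal vector is denoted by $P$, then the locally optimum detector
can be written as

$$T_{LOD}(\mathbf{r}) = \frac{(\nabla_{\mathbf{r}}^T P\, \nabla_{\mathbf{r}}) f_D(\mathbf{r})}{f_D(\mathbf{r})} \underset{H_0}{\overset{H_1}{\gtrless}} \eta. \tag{9.51}$$

As a general rule for deriving locally optimum tests, note that we maximize at the origin the first
non-vanishing derivative of the power function. For the known and the purely random signal
cases the first non-vanishing derivative is the first and the second derivative, respectively.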

9.3  Special  Cases 

In this section LOD structures will be derived for two special cases. In the first it is assumed
that the $N$ random variables in the disturbance vector $\mathbf{D}$ are statistically independent.
With this assumption, the joint PDF of the $N$ random variables is obtained as a product of the
marginal density functions of the individual random variables. In the second the $N$ random
variables are modeled as arising from an SIRP. This model enables us to write the joint PDF of the
random variables analytically, accounting for the correlation between the random variables. The
locally optimum detector structures are derived for both cases. It turns out in both cases that the
detector can be expressed in a canonical form. This canonical expression is derived for both the
known and the random signal cases.

9.3.1  The  Known  Signal  Case 

9.3.1.1 Independent Random Variables

From equation (9.32), the LOD structure in the known signal case is given as

$$T_{LOD}(\mathbf{r}) = \frac{\left.\dfrac{\partial f_D(\mathbf{r} - \theta\mathbf{s})}{\partial \theta}\right|_{\theta=0}}{f_D(\mathbf{r})}. \tag{9.52}$$

Let the $N$ random variables in the vector $\mathbf{D}$ be independent such that the PDF of the
$i$th random variable is $f_{D_i}(d_i)$. Therefore, the conditional joint density functions of the $N$
received random variables are given by

$$f_{R_1, R_2, \ldots, R_N}(r_1, r_2, \ldots, r_N | H_0) = \prod_{i=1}^{N} f_{D_i}(r_i) \tag{9.53}$$

$$f_{R_1, R_2, \ldots, R_N}(r_1, r_2, \ldots, r_N | H_1) = \prod_{i=1}^{N} f_{D_i}(r_i - \theta s_i). \tag{9.54}$$

The numerator in the ratio test of equation (9.52) is evaluated as

$$\left.\frac{\partial f_D(\mathbf{r} - \theta\mathbf{s})}{\partial \theta}\right|_{\theta=0} = \left.\frac{\partial}{\partial \theta} \left[ \prod_{i=1}^{N} f_{D_i}(r_i - \theta s_i) \right]\right|_{\theta=0} = \sum_{i=1}^{N} \left\{ (-s_i)\, f'_{D_i}(r_i) \prod_{j=1,\, j \neq i}^{N} f_{D_j}(r_j) \right\}. \tag{9.55}$$

Thus, from equation (9.52) the LOD statistic for independent random variables is given by

$$T_{LOD}(r_1, r_2, \ldots, r_N) = -\sum_{i=1}^{N} s_i\, \frac{f'_{D_i}(r_i)}{f_{D_i}(r_i)} \tag{9.56}$$

where $f'_{D_i}(r_i)$ denotes the derivative of $f_{D_i}(r_i)$ with respect to $r_i$. The above
equation for the LOD statistic is the canonical form obtained when the random variables are
independent. For different density functions, $f_{D_i}(r_i)$, the detector will be different,
although its structure remains the same. The canonical form of the detector is shown in Fig. 9.1.
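The canonical form in equation (9.56) can be sketched in a few lines: the detector is a correlator that passes each sample through the nonlinearity $-f'_{D_i}/f_{D_i}$ before weighting by $s_i$. The sketch below is illustrative only; the function names and the two marginal densities (zero-mean Gaussian and Laplace, identical across samples) are assumptions chosen for the example, not densities analyzed in the report.

```python
import math

def lod_known_signal_independent(r, s, score):
    """T_LOD = -sum_i s_i * f'(r_i) / f(r_i), where score(x) = f'(x) / f(x)."""
    return -sum(s_i * score(r_i) for r_i, s_i in zip(r, s))

def gaussian_score(sigma2=1.0):
    """f'(x)/f(x) for a zero-mean Gaussian marginal with variance sigma2."""
    return lambda x: -x / sigma2

def laplace_score(b=1.0):
    """f'(x)/f(x) for a Laplace marginal f(x) = exp(-|x|/b) / (2b).

    The derivative is undefined at x = 0; returning 0 there is a convention
    adopted for this sketch.
    """
    def score(x):
        if x == 0:
            return 0.0
        return -math.copysign(1.0, x) / b
    return score

r = [0.5, -2.0, 1.5]   # hypothetical received samples
s = [1.0, -1.0, 1.0]   # hypothetical known signal with |s_i|^2 = 1

t_gauss = lod_known_signal_independent(r, s, gaussian_score())
t_laplace = lod_known_signal_independent(r, s, laplace_score())
print(t_gauss, t_laplace)
```

With the Gaussian marginal the statistic reduces to the linear correlator $\sum_i s_i r_i$, while the Laplace marginal gives a sign correlator, which limits the influence of large-amplitude samples.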

9.3.1.2 Random Variables Arising from an SIRP Distribution

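When the random variables of the disturbance are drawn from an SIRP distribution, the joint
PDF can be written as

$$f_D(\mathbf{d}) = \frac{1}{(2\pi)^{N/2} |M|^{1/2}}\, h_N(p) \tag{9.57}$$

where $p = \mathbf{d}^T M^{-1} \mathbf{d}$, $M$ is the covariance matrix for the $N$ random
variables and $h_N(p)$ is a positive valued, nonlinear function of $p$. The numerator of the ratio
test in equation (9.52) is given by

$$\left.\frac{\partial f_D(\mathbf{r} - \theta\mathbf{s})}{\partial \theta}\right|_{\theta=0} = \left.\frac{\partial}{\partial \theta} \left\{ \frac{1}{(2\pi)^{N/2} |M|^{1/2}}\, h_N(p) \right\}\right|_{\theta=0} = \frac{1}{(2\pi)^{N/2} |M|^{1/2}} \left.\frac{\partial}{\partial \theta} \{ h_N(p) \}\right|_{\theta=0}. \tag{9.58}$$

In terms of $\theta$ and $\mathbf{d} = \mathbf{r} - \theta\mathbf{s}$, the quadratic form $p$ equals
$(\mathbf{r} - \theta\mathbf{s})^T M^{-1} (\mathbf{r} - \theta\mathbf{s})$. From the chain
rule for differentiation we have

$$\frac{\partial}{\partial \theta} h_N(p) = \frac{\partial h_N(p)}{\partial p} \frac{\partial p}{\partial \theta}. \tag{9.59}$$

From the expression for $p$,

$$\left.\frac{\partial p}{\partial \theta}\right|_{\theta=0} = -2(\mathbf{s}^T M^{-1} \mathbf{r}). \tag{9.60}$$

Making use of equations (9.58)-(9.60), the LOD statistic in equation (9.52) becomes

$$T_{LOD}(\mathbf{r}) = -2(\mathbf{s}^T M^{-1} \mathbf{r})\, \frac{h'_N(p)}{h_N(p)} \tag{9.61}$$

where $h'_N(p)$ denotes the derivative of the function $h_N(p)$ with respect to the argument $p$,
evaluated at $p = \mathbf{r}^T M^{-1} \mathbf{r}$. The LOD statistic in equation (9.61) represents
the canonical structure when the disturbance is modeled as an SIRP. The nonlinear function
$h_N(p)$ depends on the particular joint density function used to model the disturbance. The
canonical structure for the detector is shown in Fig. 9.2.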
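A minimal sketch of the SIRP canonical structure in equation (9.61) follows. It takes the ratio $h'_N(p)/h_N(p)$ as a user-supplied function, so only that ratio changes from one SIRP to another. Two ratios are shown: the Gaussian case $h_N(p) = e^{-p/2}$, and a Student-t-like case that assumes $h_N(p) \propto (b^2 + p)^{-(N/2 + \nu)}$. That power-law form, the parameter names `nu` and `b`, and the helper names are assumptions made for this illustration; the expressions for $h_N$ in the earlier SIRP chapters should be consulted for the exact form.

```python
def quad_form(a, m_inv, b):
    """Bilinear form a^T M^{-1} b for plain Python lists."""
    return sum(a[i] * m_inv[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))

def lod_known_signal_sirp(r, s, m_inv, h_ratio):
    """T_LOD = -2 (s^T M^{-1} r) h'_N(p) / h_N(p), with p = r^T M^{-1} r.

    h_ratio(p) must return h'_N(p) / h_N(p).
    """
    p = quad_form(r, m_inv, r)
    return -2.0 * quad_form(s, m_inv, r) * h_ratio(p)

def gaussian_h_ratio(p):
    """Gaussian SIRP: h_N(p) = exp(-p/2), so h'/h = -1/2."""
    return -0.5

def student_t_h_ratio(n, nu, b):
    """Assumed Student-t-like form h_N(p) ~ (b^2 + p)^-(n/2 + nu)."""
    return lambda p: -(0.5 * n + nu) / (b * b + p)

r = [1.0, 2.0]                      # hypothetical received vector
s = [1.0, 1.0]                      # hypothetical known signal
m_inv = [[1.0, 0.0], [0.0, 1.0]]    # inverse covariance (identity for simplicity)

t_gauss = lod_known_signal_sirp(r, s, m_inv, gaussian_h_ratio)
t_student = lod_known_signal_sirp(r, s, m_inv, student_t_h_ratio(n=2, nu=2.0, b=2.0))
print(t_gauss, t_student)
```

In the Gaussian case the statistic is the whitened matched filter $\mathbf{s}^T M^{-1} \mathbf{r}$. In the heavier-tailed case the same matched filter output is divided by a term that grows with $p$, so vectors with a large quadratic form are de-emphasized.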

9.3.2  The  Random  Signal  Case 

9.3.2.1 Independent Random Variables

The locally optimum detector is given by equation (9.51) when the signal is random. Rewriting
equation (9.51), the LOD structure is

$$T_{LOD}(\mathbf{r}) = \frac{(\nabla_{\mathbf{r}}^T P\, \nabla_{\mathbf{r}}) f_D(\mathbf{r})}{f_D(\mathbf{r})}. \tag{9.62}$$

$P$ is the random signal covariance matrix. For convenience, the signal random variables are
assumed to be independent, in which case the covariance matrix $P$ is diagonal. Let the diagonal
elements of the matrix $P$ be represented by $\sigma_i^2$, $i = 1, 2, \ldots, N$. Because the
disturbance random variables are also assumed to be independent, the joint density function
$f_D(\mathbf{r})$ is again given by the product of the marginal density functions of the individual
random variables. Specifically,

$$f_D(\mathbf{r}) = \prod_{i=1}^{N} f_{D_i}(r_i). \tag{9.63}$$

Also, when $P$ is diagonal,

$$\nabla_{\mathbf{r}}^T P\, \nabla_{\mathbf{r}} = \sum_{i=1}^{N} \sigma_i^2\, \frac{\partial^2}{\partial r_i^2}. \tag{9.64}$$

Using equations (9.62)-(9.64) and following the same steps as in the known signal case, the LOD
statistic can be derived as

$$T_{LOD}(\mathbf{r}) = \sum_{i=1}^{N} \sigma_i^2\, \frac{f''_{D_i}(r_i)}{f_{D_i}(r_i)} \tag{9.65}$$

where the double prime indicates the second derivative with respect to the argument. The canonical
structure derived above is shown in Fig. 9.3.
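To make the random-signal canonical form (9.65) concrete, consider identical zero-mean Gaussian marginals with variance $\sigma_D^2$. This Gaussian case is an illustrative assumption added here, not one treated in the text; it shows the LOD reducing to a weighted energy detector.

```latex
% Assumed (illustrative) Gaussian marginal:
f_{D_i}(r_i) = \frac{1}{\sqrt{2\pi\sigma_D^2}} \exp\!\left(-\frac{r_i^2}{2\sigma_D^2}\right)

% First and second derivatives:
f'_{D_i}(r_i)  = -\frac{r_i}{\sigma_D^2}\, f_{D_i}(r_i), \qquad
f''_{D_i}(r_i) = \frac{r_i^2 - \sigma_D^2}{\sigma_D^4}\, f_{D_i}(r_i)

% Substituting into the canonical form:
T_{LOD}(\mathbf{r}) = \sum_{i=1}^{N} \sigma_i^2\, \frac{r_i^2 - \sigma_D^2}{\sigma_D^4}
```

The constant term $-\sum_i \sigma_i^2/\sigma_D^2$ can be absorbed into the threshold, leaving a sum of squared samples weighted by the signal variances.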

9.3.2.2 Random Variables Arising from an SIRP Distribution


When the disturbance vector is modeled as having an SIRP distribution, the joint PDF and
the LOD structure are given by equations (9.57) and (9.62), respectively. Since the constant
terms in the joint density function cancel out in the numerator and denominator of the ratio test
in equation (9.62), the LOD statistic is obtained by evaluating

$$T_{LOD}(\mathbf{r}) = \frac{(\nabla_{\mathbf{r}}^T P\, \nabla_{\mathbf{r}})\, h_N(p)}{h_N(p)}. \tag{9.66}$$


The locally optimum detector statistic that results from the above equation can be written as


y  (r\  —  ^^n(p)Sm  4hN(p).£L>  T 

W)  ~  “W  +  *  1 


(9.67) 


where $S_M$ represents the sum of all the elements of the matrix $M^{-1}$ and $M_i^{-1}$ represents
the $i$th column of $M^{-1}$. The canonical structure of the detector is shown in Fig. 9.4.
Chapter 10


Determining  Thresholds  for  the 
Locally  Optimum  Detector 

10.1  Introduction 

The hypothesis testing problem for deciding whether or not a target is present is given by
equations (9.1)-(9.2) in Chapter 9. For weak signal applications, it was shown that the Locally
Optimum Detector is useful in obtaining a decision rule. For the known signal case,
the LOD structure is given by equation (9.32). Since the test statistic is a nonlinear function
when $f_{\mathbf{R}}(\mathbf{r}|H_0)$ and $f_{\mathbf{R}}(\mathbf{r}|H_1)$ are multivariate
non-Gaussian density functions, it is not possible, in general, to analytically evaluate in closed
form the threshold $\eta$ for a specified false alarm probability. Given the probability density
functions (PDFs) of the test statistic, denoted by $T$, under hypotheses $H_1$ and $H_0$, the
detection and false alarm probabilities are

$$P_D = \int_{\eta}^{\infty} f_T(t|H_1)\, dt \tag{10.1}$$

$$P_F = \int_{\eta}^{\infty} f_T(t|H_0)\, dt. \tag{10.2}$$

$P_D$ and $P_F$ are represented by the shaded areas shown in Fig. 10.1. As indicated in the figure,
$P_F$ is typically much smaller than $P_D$.
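When the densities of $T$ are known in closed form, equations (10.1) and (10.2) can be evaluated directly. The sketch below assumes, purely for illustration, that $T$ is Gaussian under both hypotheses; the means and variances are hypothetical and are not results from the report. It uses `statistics.NormalDist` from the Python standard library to invert (10.2) for the threshold and then evaluates (10.1).

```python
from statistics import NormalDist

def threshold_for_false_alarm(h0_dist, p_f):
    """Threshold eta such that P(T > eta | H0) = p_f."""
    return h0_dist.inv_cdf(1.0 - p_f)

def tail_probability(dist, eta):
    """P(T > eta) for the given distribution of the test statistic."""
    return 1.0 - dist.cdf(eta)

# Hypothetical Gaussian models for the test statistic T (illustration only).
h0 = NormalDist(mu=0.0, sigma=1.0)   # target absent
h1 = NormalDist(mu=3.0, sigma=1.0)   # target present

eta = threshold_for_false_alarm(h0, 1e-2)
p_f = tail_probability(h0, eta)      # recovers the specified false alarm probability
p_d = tail_probability(h1, eta)      # resulting detection probability
print(eta, p_f, p_d)
```

For the non-Gaussian test statistics considered in this chapter no such closed form is generally available, which is why the threshold has to be estimated from data.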

In practice, the density function of $T$ is not known in advance. For example, depending
upon various conditions such as terrain, weather, etc., the clutter may be from a Gaussian,
K-distributed, Weibull or some other probability distribution. It has recently been shown [50] that
approximations for the PDFs of $T$ can be determined experimentally using a relatively small
number of samples (e.g., 50-100 samples give good fits depending on the distribution). Because
the number of samples required by Ozturk's technique is small, it is unlikely that actual data
samples will be from the extreme tails of the PDFs. Consequently, the good fit mentioned above
applies to the main body of the density function.

In order to establish the threshold for a specified $P_F$, it is necessary to accurately know
the behavior of the tail of $f_T(t|H_0)$. The threshold can be determined through Monte Carlo
techniques. Unfortunately, the number of trials $M$ required is given by the rule of thumb

M  >  ~.  (10.3) 

•  F 

Hence, if $P_F = 10^{-6}$, at least one million trials should be generated. Clearly, this is not a very
desirable situation. In this chapter a new approach is developed for experimentally determining
the extreme tail of $f_T(t|H_0)$, where the number of samples required is several orders of magnitude
smaller than that suggested by equation (10.3). Once the tail of $f_T(t|H_0)$ has been estimated,
the threshold can be determined by use of equation (10.2).

10.2  Methods  for  Estimating  Thresholds 

10.2.1  Estimates  Based  on  Raw  Data 

In this section we consider some commonly used threshold estimates. These estimates are
called raw estimates and are already included in some statistical package programs (e.g., the
UNIVARIATE procedure in the SAS package).

Let $X_1 < X_2 < \ldots < X_n$ denote the sample order statistics from a distribution function $F(x)$.
Let $p$ denote the desired false alarm probability. Also, let $n(1-p) = j + g$, where $j$ is the integer
part of $n(1-p)$. We denote the threshold estimate based on the $k$th procedure, to be described
below, by $\hat{\eta}_p^{(k)}$. Four different threshold estimates are given as follows:


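$$\hat{\eta}_p^{(1)} = (1 - g)X_j + gX_{j+1} \tag{10.4}$$

$$\hat{\eta}_p^{(2)} = X_k, \quad \text{where } k \text{ is the integer part of } [n(1 - p) + 1/2] \tag{10.5}$$

$$\hat{\eta}_p^{(3)} = (1 - \delta)X_j + \delta X_{j+1}, \quad \delta = 0 \text{ if } g = 0; \; \delta = 1 \text{ if } g > 0 \tag{10.6}$$

$$\hat{\eta}_p^{(4)} = \delta X_{j+1} + (1 - \delta)(X_j + X_{j+1})/2, \quad \delta = 0 \text{ if } g = 0; \; \delta = 1 \text{ if } g > 0. \tag{10.7}$$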

It is known that all of the above methods are asymptotically equivalent. Thus, if a large
sample size is used (where, for example, $M$ is determined from equation (10.3)), the choice of the
best method is no longer critical. However, in an empirical study [67], it has been shown that
$\hat{\eta}_p^{(4)}$ outperformed the other estimators when $g = 0$. It is noted that the methods based on the
above estimators are restricted by the condition that $1 \leq n(1 - p) \leq n - 1$. This implies that
the false alarm probability $p$ cannot be lower than $1/n$. Consequently, the smallest false alarm
probability for which a threshold can be estimated by these procedures depends on the sample
size. Thus, for a reasonable size of $n$, thresholds for small false alarm probabilities cannot be
estimated when these methods are used.
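
The four raw estimates of equations (10.4)-(10.7) can be sketched in Python as follows. This is a minimal illustration only: the function name `raw_threshold_estimates` and the small round-off tolerance are assumptions introduced here and are not part of the report or of the SAS procedure.

```python
import math

def raw_threshold_estimates(samples, p):
    """Return the four raw threshold estimates of equations (10.4)-(10.7)."""
    x = sorted(samples)                     # order statistics X_1 <= ... <= X_n
    n = len(x)
    target = n * (1.0 - p)                  # n(1 - p) = j + g
    if not (1.0 <= target <= n - 1):
        raise ValueError("requires 1 <= n(1-p) <= n-1, i.e. p >= 1/n")
    j = int(math.floor(target + 1e-12))     # integer part (tolerance is an assumption)
    g = target - j                          # fractional part
    if abs(g) < 1e-12:                      # treat round-off as g = 0
        g = 0.0

    def X(i):
        return x[i - 1]                     # 1-based access to the order statistics

    est1 = (1.0 - g) * X(j) + g * X(j + 1)                        # (10.4)
    k = int(math.floor(target + 0.5 + 1e-12))
    est2 = X(k)                                                   # (10.5)
    delta = 0.0 if g == 0.0 else 1.0
    est3 = (1.0 - delta) * X(j) + delta * X(j + 1)                # (10.6)
    est4 = delta * X(j + 1) + (1.0 - delta) * (X(j) + X(j + 1)) / 2.0  # (10.7)
    return est1, est2, est3, est4
```

For the ten samples 1, 2, ..., 10 and $p = 0.1$ (so that $g = 0$), the first three estimates coincide at $X_9$ and only the fourth differs, since it averages $X_9$ and $X_{10}$. The call fails by construction when $p < 1/n$, which is the sample-size restriction discussed above.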

10.2.2  Estimates  Motivated  by  the  Extreme  Value  Theory 

Extreme value distributions are obtained as limiting distributions of the largest (or smallest) values
of sample order statistics. Assuming independent trials, if $X_1 < X_2 < \ldots < X_n$ are order
statistics from a common distribution function $F(x)$, then the cumulative distribution function
of the largest order statistic is given by

$$G_n(x) = P(X_n \leq x) = [F(x)]^n. \tag{10.8}$$

It is clear that, as $n \to \infty$, the limiting value of $G_n(x)$ approaches zero if $F(x)$ is less than 1 and
unity if $F(x)$ is equal to 1 for a specified value of $x$. A standardized limiting distribution of $X_n$
may be obtained by introducing the linear transformation $a_n X_n + b_n$, where $a_n$ and $b_n$ are finite
constants depending on the sample size $n$.

In Appendix C, using the theory of limiting distributions [68], it is shown that if there exist
sequences $a_n$ and $b_n$ such that

$$\lim_{n \to \infty} P\left(\frac{X_n - b_n}{a_n} \leq x\right) = \lim_{n \to \infty} F^n(a_n x + b_n) = \lim_{n \to \infty} G_n(a_n x + b_n) = \Lambda(x) \tag{10.9}$$

then the solution of the above functional equation yields all the possible limiting forms for the
distribution function $G_n(x)$. The solutions to the above equation are derived in Appendix C and
are rewritten here:

$$\Lambda(x) = \exp(-e^{-x}), \quad x > 0 \tag{10.10}$$

$$\Lambda(x) = \exp(-x^{-k}), \quad x > 0, \; k > 0 \tag{10.11}$$

$$\Lambda(x) = \exp(-(-x)^{k}), \quad x < 0, \; k > 0. \tag{10.12}$$

In the limit, as $n$ gets large, these are the three types of distribution functions to which the
distribution of the largest order statistic drawn from almost any smooth and continuous
distribution function converges. Therefore, for large $x$, the tails of almost all smooth and
continuous probability density functions for the largest order statistic also converge to three
limiting forms. From equations (10.10) and (10.11), respectively, the two limiting forms that
pertain to the right tail (the case of interest for the locally optimum detector test statistic) are [68]

1. $$\frac{d\Lambda(x)}{dx} \approx H(x) = e^{-x} \tag{10.13}$$

2. $$\frac{d\Lambda(x)}{dx} \approx H(x) = k x^{-(k+1)}, \quad k > 0 \tag{10.14}$$

where $H(x)$ approximates the probability density function for large $x$. The first equation above
is the well-known exponential distribution and the second equation is related to the Pareto
distribution. The details that lead to the limiting distributions of the tails are shown in Appendix C.
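
As a numerical illustration of equations (10.8)-(10.10), the sketch below compares $[F(a_n x + b_n)]^n$ with $\exp(-e^{-x})$ for a unit exponential $F$. The choice of the exponential distribution and of the normalizing constants $a_n = 1$, $b_n = \ln n$ is an assumption made for this example (it is a standard textbook choice, not one taken from the report), as are the function names.

```python
import math

def max_cdf_exponential(x, n):
    """[F(x + ln n)]^n for F(x) = 1 - exp(-x): CDF of the shifted largest order statistic."""
    arg = x + math.log(n)          # a_n = 1, b_n = ln n (assumed normalizing constants)
    if arg <= 0.0:
        return 0.0
    return (1.0 - math.exp(-arg)) ** n

def first_limiting_form(x):
    """Limiting form of equation (10.10)."""
    return math.exp(-math.exp(-x))

if __name__ == "__main__":
    for n in (10, 1000, 100000):
        gap = abs(max_cdf_exponential(1.0, n) - first_limiting_form(1.0))
        print(n, gap)              # the gap shrinks as n grows
```

In this example the agreement improves steadily with $n$, which is the convergence that equation (10.9) describes.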

It  remains  to  be  explained  how  the  distribution  of  the  largest  order  statistic  is  related  to  the 
tails  of  the  underlying  PDF  from  which  the  samples  are  drawn.  The  relationship  is  based  on 
the  observation  that  inferences  from  short  sequences  are  likely  to  be  unreliable.  In  particular, 
instead  of  observing  k  sets  of  n  samples  and  taking  the  largest  order  statistic  from  each  of  the 
k  sets,  it  is  better  to  observe  a  single  set  of  nk  samples  and  use  the  largest  k  samples  from  this 
set  [69].  The  k  largest  order  statistics  from  a  vector  of  nk  observations  constitute  the  tail  of  the 
underlying  distribution  especially  when  n  is  very  large.  Therefore,  the  limiting  distribution  of 
the  largest  order  statistic  closely  approximates  the  tail  of  the  underlying  PDF  for  large  n. 

10.3  The  Generalized  Pareto  Distribution 

The Generalized Pareto Distribution (GPD) is defined for $x > 0$ by the distribution function

$$G(x) = 1 - (1 + \gamma x/\sigma)^{-1/\gamma}, \quad -\infty < \gamma < \infty, \; \sigma > 0, \; \gamma x > -\sigma. \tag{10.15}$$

This distribution has a simple closed form and includes a range of distributions depending upon
the choice of $\gamma$ and $\sigma$. For example, the exponential distribution results for $\gamma = 0$ and the
uniform distribution is obtained when $\gamma = -1$. The GPD defined in equation (10.15) is valid
for all $x > 0$, while equations (10.13) and (10.14) are valid only for large $x$.

The probability density function corresponding to the GPD is given by

$$g(x) = \frac{d}{dx}\left[1 - \left(1 + \frac{\gamma x}{\sigma}\right)^{-\frac{1}{\gamma}}\right] = \frac{1}{\sigma}\left(1 + \frac{\gamma x}{\sigma}\right)^{-\frac{1}{\gamma} - 1}. \tag{10.16}$$

If we let $\gamma \to 0$ in the above equation, note that

$$\lim_{\gamma \to 0} \frac{1}{\sigma}\left(1 + \frac{\gamma x}{\sigma}\right)^{-\frac{1}{\gamma} - 1} = \frac{1}{\sigma} e^{-x/\sigma}. \tag{10.17}$$

Also, if we let $x$ be large in equation (10.16), note that

$$\frac{1}{\sigma}\left(1 + \frac{\gamma x}{\sigma}\right)^{-\frac{1}{\gamma} - 1} \approx \frac{1}{\sigma}\left(\frac{\gamma}{\sigma}\right)^{-\frac{1}{\gamma} - 1} x^{-\frac{1}{\gamma} - 1}. \tag{10.18}$$

Equations (10.17) and (10.18) are of the same form as equations (10.13) and (10.14). Thus,
the GPD can be used to approximate both types of tail behavior exhibited by the right tail.
Typical plots of the Generalized Pareto PDF are shown for $\gamma < 0$ and $\gamma > 0$ in Figures 10.2 and
10.3.
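
A short Python sketch of the GPD distribution and density functions of equations (10.15) and (10.16) is given below. The function names, the tolerance used to switch to the $\gamma \to 0$ limit of equation (10.17), and the handling of points beyond the upper end point when $\gamma < 0$ are assumptions of this sketch.

```python
import math

def gpd_cdf(x, sigma, gamma):
    """G(x) of equation (10.15); gamma near 0 uses the exponential limit."""
    if x < 0.0:
        return 0.0
    if abs(gamma) < 1e-12:
        return 1.0 - math.exp(-x / sigma)
    base = 1.0 + gamma * x / sigma
    if base <= 0.0:                 # beyond the upper end point (only when gamma < 0)
        return 1.0
    return 1.0 - base ** (-1.0 / gamma)

def gpd_pdf(x, sigma, gamma):
    """g(x) of equation (10.16); gamma near 0 uses the limit in equation (10.17)."""
    if x < 0.0:
        return 0.0
    if abs(gamma) < 1e-12:
        return math.exp(-x / sigma) / sigma
    base = 1.0 + gamma * x / sigma
    if base <= 0.0:
        return 0.0
    return base ** (-1.0 / gamma - 1.0) / sigma
```

With $\gamma = -1$ the functions reduce to the uniform distribution on $(0, \sigma)$, and with $\gamma = 0$ to the exponential distribution with mean $\sigma$, matching the two special cases mentioned in the text.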

We wish to set thresholds for specified false alarm probabilities when the underlying density
functions are unknown. To set very small false alarm probabilities, the tail of the PDF $f_T(t|H_0)$
has to be accurately modeled. Figure 10.4 represents a typical PDF of the test statistic, with the
tail region of the PDF being defined as that to the right of $t = t_0$. Figure 10.5 shows the tail
translated to the origin. The choice for $t_0$ is somewhat arbitrary. For example, $t_0$ can be chosen
such that the area in the shaded region equals 0.1, 0.05 or 0.01. It is the portion of the PDF to
the right of $t_0$ that we are interested in modeling by the GPD. In particular, the tail region of
the PDF is translated to the origin and modeled as a GPD. Once the estimates of $\sigma$ and $\gamma$ have
been obtained, the GPD is scaled by the area of the shaded region and translated back to the
point $t_0$. In this way, the area under the PDF of the test statistic is maintained at unity.

10.3.1  Methods  for  Estimating  the  Parameters  of  the  GPD 

Suppose that the sample order statistics $X_1 < X_2 < \ldots < X_n$ are drawn from the distribution
function $F(x)$. To estimate the right tail of this distribution it is necessary to determine a value
(say $x_0$) and then use those sample observations which are greater than $x_0$ to obtain the quantity
$z = x - x_0$. Once the tail observations have been chosen, the Generalized Pareto Distribution
can be fitted to these observations by using standard methods of parameter estimation. Observe
that the portion of the observations used from a complete set of samples depends on the choice
of $x_0$. One approach to selecting $x_0$ is to make a histogram of the data set and choose $x_0$ to
be near the point of inflection of the histogram. DuMouchel [70] proposed choosing $x_0$ to be
the value such that $\int_{x_0}^{\infty} f_X(x)\,dx = 0.1$. Such an approach is less subjective and appears to be
satisfactory for many applications. However, it is noted by DuMouchel that “using an even
smaller fraction of observations would restrict profitable use of the statistic to much larger sizes.
On the other hand, to use more than the upper one tenth of a sample would seem to allow too
much dependence on the central part of the distribution.” In other words, if a smaller fraction is
used, we need larger sample sizes to get an adequate number of samples for estimation, and if a
larger fraction is used, the body of the distribution may influence estimation of the tail.

Figure 10.4: PDF of test statistic with tail region defined for $t > t_0$.

Let $x_0$ be chosen as the value such that $1 - F(x_0) = \int_{x_0}^{\infty} f_X(x)\,dx = a$. The distribution
function to be used in approximating the tail can be written as

$$F(x) = (1 - a) + a\,G(x - x_0) = 1 - a\left[1 + \frac{\gamma}{\sigma}(x - x_0)\right]^{-1/\gamma}, \quad x > x_0 \tag{10.19}$$

where $G(x)$ is given in equation (10.15). Assuming that the tail of a given distribution can be
approximated by equation (10.19), the estimation problem of the distribution in the tail
region is reduced to estimation of the parameters of the Generalized Pareto distribution.

In this chapter we consider three methods for the parameter estimation of the Generalized
Pareto distribution. The three methods are maximum likelihood estimation, the method of
probability weighted moments, and the ordered sample least squares approach. The first two
methods, applied to the GPD, are discussed by Hosking and Wallis [71]. The ordered sample
least squares approach is a new technique developed in this work. The performance of the three
estimation procedures is compared on the basis of estimation bias and mean square error.

10.3.1.1  Maximum  Likelihood  Estimation 

The probability density function corresponding to the GPD from equation (10.16), with $x$
replaced by $z$, is

$$g(z) = \frac{1}{\sigma}\left(1 + \frac{\gamma z}{\sigma}\right)^{-\frac{1}{\gamma} - 1}. \tag{10.20}$$

Given a sample vector $[z_1, z_2, \ldots, z_m]$ from the GPD, the joint density function $L_Z(\mathbf{z})$ of the $m$
samples, assuming independence, is given by

$$L_Z(\mathbf{z}) = \frac{1}{\sigma^m} \prod_{i=1}^{m} \left(1 + \frac{\gamma z_i}{\sigma}\right)^{-\frac{1}{\gamma} - 1}. \tag{10.21}$$

To theoretically obtain the maximum likelihood estimates of $\sigma$ and $\gamma$, the logarithm of the joint
density function in equation (10.21) is differentiated with respect to $\sigma$ and $\gamma$, respectively, and the
derivatives are set to zero. Let the largest $m$ observations from the unknown distribution whose
tail is being modeled by the GPD be placed in the vector $[x_{n-m+1}, x_{n-m+2}, \ldots, x_n]$. Translation of the
tail region to the origin results in the vector $[x_{n-m+1} - x_0, x_{n-m+2} - x_0, \ldots, x_n - x_0] = [z_1, z_2, \ldots, z_m]$.
Letting $\tau = \gamma/\sigma$ in equation (10.21) and differentiating the logarithm of the joint density function
with respect to $\sigma$ we get

$$\frac{\partial}{\partial \sigma} \log L_Z(\mathbf{z}) = -\frac{\partial}{\partial \sigma}\left[m \log(\sigma) + \left(1 + (\tau\sigma)^{-1}\right) \sum_{i=1}^{m} \log(1 + \tau z_i)\right] = -\frac{m}{\sigma} + \frac{1}{\tau\sigma^2} \sum_{i=1}^{m} \log(1 + \tau z_i). \tag{10.22}$$

By setting equation (10.22) to zero, an expression for $\sigma$ that satisfies the equation is

$$\sigma(\tau) = \sum_{i=1}^{m} \log(1 + \tau z_i)/(m\tau). \tag{10.23}$$

The expression for $\sigma$ is now substituted into the bracketed quantity in equation (10.22), so as to
obtain a function of $\tau$ alone. $\hat{\tau}$ is derived by differentiating the quantity

$$m \log \sigma(\tau) + \left(1 + 1/(\sigma(\tau)\tau)\right) \sum_{i=1}^{m} \log(1 + \tau z_i) \tag{10.24}$$

with respect to $\tau$ and setting the derivative equal to zero, with the constraint that $\tau z_i > -1$.
However, the differentiation leads to a nonlinear equation whose analytical solution is not known.
This difficulty is circumvented by minimizing equation (10.24) numerically with respect to $\tau$.
The numerical minimization was performed using the Nelder-Mead algorithm [72]. Once the
estimate for $\tau$ has been obtained, $\sigma$ is obtained from equation (10.23) and $\gamma$ is estimated
by $\hat{\gamma} = \hat{\sigma}\hat{\tau}$.
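
The one-dimensional minimization of equation (10.24) can be sketched in Python as follows. This is not the implementation used in the report: a coarse grid search followed by a golden-section refinement is substituted here for the Nelder-Mead algorithm so that the example needs only the standard library. The search interval for $\tau$, the grid size, and all function names are assumptions of this sketch.

```python
import math

def sigma_of_tau(z, tau):
    """Equation (10.23); the tau -> 0 limit is the sample mean."""
    m = len(z)
    if abs(tau) < 1e-12:
        return sum(z) / m
    return sum(math.log1p(tau * zi) for zi in z) / (m * tau)

def profile_objective(z, tau):
    """Equation (10.24) with sigma(tau) from (10.23) substituted."""
    m = len(z)
    s = sum(math.log1p(tau * zi) for zi in z)
    # (1 + 1/(sigma*tau)) * s reduces to s + m because sigma*tau = s/m
    return m * math.log(sigma_of_tau(z, tau)) + s + m

def golden_min(f, a, b, tol=1e-10, max_iter=200):
    """Golden-section search for a minimum of f on [a, b]."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    fc, fd = f(c), f(d)
    for _ in range(max_iter):
        if abs(b - a) < tol:
            break
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def fit_gpd_ml(z, n_grid=400):
    """Return (sigma_hat, gamma_hat) for tail excesses z (all positive)."""
    zmax, zbar = max(z), sum(z) / len(z)
    lo, hi = -0.9 / zmax, 5.0 / zbar      # assumed interval; keeps tau * z_i > -1
    grid = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    vals = [profile_objective(z, t) for t in grid]
    i = min(range(n_grid), key=vals.__getitem__)
    i = max(1, min(n_grid - 2, i))        # keep a bracket around the best grid point
    tau = golden_min(lambda t: profile_objective(z, t), grid[i - 1], grid[i + 1])
    sigma = sigma_of_tau(z, tau)
    return sigma, sigma * tau             # gamma_hat = sigma_hat * tau_hat
```

The lower limit of the search interval is kept away from $-1/\max_i z_i$ because the objective is unbounded as $\tau$ approaches that boundary; a different interval may be needed for strongly bounded tails.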

10.3.1.2  Probability  Weighted  Moments 

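The probability weighted moments of a continuous random variable $Z$ with distribution function
$G$ are the quantities

$$M_{p,r,s} = E[Z^p G^r(Z)(1 - G(Z))^s] \tag{10.25}$$

where $E$ is the expectation operator and $p$, $r$ and $s$ are real numbers. For the GPD it is convenient
to choose $p = 1$ and $r = 0$. Then the probability weighted moments are

$$M_{1,0,s} = E[Z(1 - G(Z))^s]. \tag{10.26}$$

For the GPD there are two parameters to be estimated, $\sigma$ and $\gamma$. Substituting $s = 0$ in equation
(10.26), we get

$$c_0 = M_{1,0,0} = E[Z] = \int_0^{\infty} \frac{z}{\sigma}\left(1 + \frac{\gamma z}{\sigma}\right)^{-\frac{1}{\gamma} - 1} dz. \tag{10.27}$$

Letting $1 + \gamma z/\sigma = Y$, equation (10.27) results in

$$c_0 = \frac{\sigma}{\gamma^2} \int_1^{\infty} (Y - 1) Y^{-\frac{1}{\gamma} - 1}\,dY = \frac{\sigma}{\gamma^2}\left[\frac{Y^{-\frac{1}{\gamma} + 1}}{-\frac{1}{\gamma} + 1} - \frac{Y^{-\frac{1}{\gamma}}}{-\frac{1}{\gamma}}\right]_1^{\infty} = \frac{\sigma}{1 - \gamma}. \tag{10.28}$$

Letting $s = 1$ in equation (10.26) we obtain

$$c_1 = M_{1,0,1} = E[Z(1 - G(Z))] = \int_0^{\infty} \frac{z}{\sigma}\left(1 + \frac{\gamma z}{\sigma}\right)^{-\frac{1}{\gamma}}\left(1 + \frac{\gamma z}{\sigma}\right)^{-\frac{1}{\gamma} - 1} dz. \tag{10.29}$$

Letting $1 + \gamma z/\sigma = Y$, as before, equation (10.29) results in

$$c_1 = \frac{\sigma}{\gamma^2} \int_1^{\infty} (Y - 1) Y^{-\frac{2}{\gamma} - 1}\,dY = \frac{\sigma}{\gamma^2}\left[\frac{Y^{-\frac{2}{\gamma} + 1}}{-\frac{2}{\gamma} + 1} - \frac{Y^{-\frac{2}{\gamma}}}{-\frac{2}{\gamma}}\right]_1^{\infty} = \frac{\sigma}{2(2 - \gamma)}. \tag{10.30}$$

The values of $c_0$ and $c_1$ are obtained from equations (10.28) and (10.30), respectively, for given
values of $\sigma$ and $\gamma$. Since there are two equations in two unknowns, $\sigma$ and $\gamma$ can be obtained as
functions of $c_0$ and $c_1$. Solving for $\sigma$ and $\gamma$ we obtain

$$\sigma = 2c_0c_1/(c_0 - 2c_1) \tag{10.31}$$

and

$$\gamma = 2 - c_0/(c_0 - 2c_1) \tag{10.32}$$

where $c_0$ and $c_1$ are estimated from the data by the estimators $\hat{c}_0 = \sum_{i=1}^{m} z_i/m$ and
$\hat{c}_1 = \sum_{i=1}^{m} (m - i) z_i/(m(m - 1))$ [71], with the $z_i$ taken in ascending order. Once the values of
$\hat{c}_0$ and $\hat{c}_1$ are obtained, the estimates of $\sigma$ and $\gamma$ are obtained by making use of equations
(10.31) and (10.32). Note that the method of probability weighted moments involves
computationally simple expressions for the estimates.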
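
The closed-form estimates of equations (10.31) and (10.32) translate directly into a few lines of Python. The sketch below is illustrative only; the function name `fit_gpd_pwm` is an assumption, and the data are sorted internally because the estimator of $c_1$ weights the ordered observations.

```python
def fit_gpd_pwm(z):
    """Return (sigma_hat, gamma_hat) from equations (10.31) and (10.32)."""
    zs = sorted(z)                         # z_1 <= ... <= z_m
    m = len(zs)
    c0 = sum(zs) / m                       # estimate of c_0 = E[Z]
    c1 = sum((m - i) * zi for i, zi in enumerate(zs, start=1)) / (m * (m - 1))
    denom = c0 - 2.0 * c1
    sigma = 2.0 * c0 * c1 / denom          # (10.31)
    gamma = 2.0 - c0 / denom               # (10.32)
    return sigma, gamma
```

As a hand-checkable case, the three excesses 1, 2, 3 give $\hat{c}_0 = 2$ and $\hat{c}_1 = 2/3$, hence $\hat{\gamma} = -1$ and $\hat{\sigma} = 4$, i.e. a uniform distribution on $(0, 4)$ whose mean is 2.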
10.3.1.3  The  Ordered  Sample  Least  Squares  Method  -  A  new  approach 

The procedure used in maximum likelihood estimation is based on minimizing the quantity
in equation (10.24). Similarly, the probability weighted moment estimates are obtained by
equating the theoretical values of the quantity $E[Z(1 - G(Z))^s]$, $s = 0, 1$, where $Z = X - x_0$,
with the corresponding sample-based values. The ordered sample least squares method is based
on the principle of minimizing the squared distance between the ordered sample and the expected
value of the ordered sample. Computer simulations reveal that this can be a more suitable
approach for estimating the parameters.

In Appendix C the method for evaluating the mean and the variance of the $r$th order statistic
from a sample of size $n$ is presented. For the Generalized Pareto Distribution the mean and the
variance of the $r$th order statistic can be derived since the probability distribution function is
known in closed form. Let $x$ be replaced by $z$ in equation (10.15) and let $G(z) = u$. Solution
for $z$ results in

$$z = G^{-1}(u) = \frac{\sigma}{\gamma}\left[(1 - u)^{-\gamma} - 1\right]. \tag{10.33}$$

Making use of the above equation and equation (C.62) in Appendix C, the expected value of $Z_r$
is

$$E(Z_r) = \frac{\sigma}{\gamma} \frac{n!}{(r - 1)!(n - r)!} \int_0^1 \left((1 - u)^{-\gamma} - 1\right) u^{r-1} (1 - u)^{n-r}\,du. \tag{10.34}$$

The integral in the above equation can be broken into two parts as follows:

$$E(Z_r) = \frac{\sigma}{\gamma} \frac{n!}{(r - 1)!(n - r)!} \left[\int_0^1 u^{r-1} (1 - u)^{n-r-\gamma}\,du - \int_0^1 u^{r-1} (1 - u)^{n-r}\,du\right]. \tag{10.35}$$


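From results presented in Gradshteyn and Ryzhik [45], the expression for $E(Z_r)$ becomes

$$\begin{aligned}
E(Z_r) &= \frac{\sigma}{\gamma} \frac{n!}{(r - 1)!(n - r)!} \left[\frac{(r - 1)!(n - r - \gamma)!}{(n - \gamma)!} - \frac{(r - 1)!(n - r)!}{n!}\right] \\
&= \frac{\sigma}{\gamma}\left[\frac{n!\,(n - r - \gamma)!}{(n - r)!\,(n - \gamma)!} - 1\right] \\
&= \frac{\sigma}{\gamma}\left[\frac{\Gamma(n + 1)\Gamma(n - r - \gamma + 1)}{\Gamma(n - r + 1)\Gamma(n - \gamma + 1)} - 1\right].
\end{aligned} \tag{10.36}$$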

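To calculate the variance of $Z_r$, we first calculate $E(Z_r^2)$. Making use of equation (10.33) and
equation (C.65) in Appendix C, the expected value of $Z_r^2$ is

$$E(Z_r^2) = \frac{\sigma^2}{\gamma^2} \frac{n!}{(r - 1)!(n - r)!} \int_0^1 \left((1 - u)^{-\gamma} - 1\right)^2 u^{r-1} (1 - u)^{n-r}\,du. \tag{10.37}$$

The integral in the above equation can be rewritten as follows:

$$E(Z_r^2) = \frac{\sigma^2}{\gamma^2} \frac{n!}{(r - 1)!(n - r)!} \int_0^1 \left((1 - u)^{-2\gamma} - 2(1 - u)^{-\gamma} + 1\right) u^{r-1} (1 - u)^{n-r}\,du. \tag{10.38}$$

Making use of results from [73], the above integral evaluates to

$$\begin{aligned}
E(Z_r^2) &= \frac{\sigma^2}{\gamma^2}\left\{\frac{n!}{(n - r)!}\left[\frac{(n - r - 2\gamma)!}{(n - 2\gamma)!} - \frac{2(n - r - \gamma)!}{(n - \gamma)!}\right] + 1\right\} \\
&= \frac{\sigma^2}{\gamma^2}\left\{\frac{\Gamma(n + 1)}{\Gamma(n - r + 1)}\left[\frac{\Gamma(n - r - 2\gamma + 1)}{\Gamma(n - 2\gamma + 1)} - \frac{2\Gamma(n - r - \gamma + 1)}{\Gamma(n - \gamma + 1)}\right] + 1\right\}.
\end{aligned} \tag{10.39}$$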


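From equations (10.36) and (10.39), and using the result $Var(Z_r) = E(Z_r^2) - E^2(Z_r)$, we have

$$\begin{aligned}
Var(Z_r) = {} & \frac{\sigma^2}{\gamma^2}\left\{\frac{\Gamma(n + 1)}{\Gamma(n - r + 1)}\left[\frac{\Gamma(n - r - 2\gamma + 1)}{\Gamma(n - 2\gamma + 1)} - \frac{2\Gamma(n - r - \gamma + 1)}{\Gamma(n - \gamma + 1)}\right] + 1\right\} \\
& - \left\{\frac{\sigma}{\gamma}\left[\frac{\Gamma(n + 1)\Gamma(n - r - \gamma + 1)}{\Gamma(n - r + 1)\Gamma(n - \gamma + 1)} - 1\right]\right\}^2.
\end{aligned} \tag{10.40}$$

Simplifying the above equation results in

$$Var(Z_r) = \frac{\sigma^2}{\gamma^2}\left[\frac{\Gamma(n + 1)}{\Gamma(n - r + 1)} \frac{\Gamma(n - r - 2\gamma + 1)}{\Gamma(n - 2\gamma + 1)} - \frac{\Gamma^2(n + 1)}{\Gamma^2(n - r + 1)} \frac{\Gamma^2(n - r - \gamma + 1)}{\Gamma^2(n - \gamma + 1)}\right]. \tag{10.41}$$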


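Letting $Q_r(\gamma) = \dfrac{\Gamma(n + 1)\Gamma(n - r - \gamma + 1)}{\Gamma(n - r + 1)\Gamma(n - \gamma + 1)}$ results in

$$E(Z_r) = \frac{\sigma}{\gamma}\{Q_r(\gamma) - 1\} \tag{10.42}$$

$$Var(Z_r) = \sigma_r^2 = \frac{\sigma^2}{\gamma^2}\{Q_r(2\gamma) - (Q_r(\gamma))^2\}. \tag{10.43}$$

A computationally simpler expression can be found for $Q_r(\gamma)$ by making use of the properties of
gamma functions. Dividing $Q_r(\gamma)$ by $Q_{r-1}(\gamma)$ we get

$$\frac{Q_r(\gamma)}{Q_{r-1}(\gamma)} = \frac{\dfrac{\Gamma(n + 1)}{\Gamma(n - r + 1)} \dfrac{\Gamma(n - r - \gamma + 1)}{\Gamma(n - \gamma + 1)}}{\dfrac{\Gamma(n + 1)}{\Gamma(n - r + 2)} \dfrac{\Gamma(n - r - \gamma + 2)}{\Gamma(n - \gamma + 1)}} = \frac{n - r + 1}{n - r - \gamma + 1}. \tag{10.44}$$

Equation (10.44) reduces to

$$Q_r(\gamma) = \prod_{k=1}^{r} (n - k + 1)/(n - k + 1 - \gamma). \tag{10.45}$$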


To find the least squares estimates of the parameters we write the following non-linear model
for the $r$th sample order statistic

$$Z_r = E(Z_r) + \epsilon_r, \quad r = 1, 2, \ldots, m \tag{10.46}$$

where the error term $\epsilon_r$ has a distribution with mean 0 and variance $\sigma_r^2$. Since the order statistics
are not independent, the errors are also not independent. Because of the non-linear structure of
the model in equation (10.46) and the correlated errors, least squares estimation does not offer a
straightforward solution to the estimation problem. Even so, in this study we proceed to use the
ordered sample least squares (OSLS) procedure to estimate the parameters.

In equation (10.42), we note that the scale parameter $\sigma$ appears linearly whereas the shape
parameter $\gamma$ does not. The least squares estimates are obtained by minimizing the quantity

$$S = \sum_{r=1}^{m} \epsilon_r^2 = \sum_{r=1}^{m} \left(Z_r - \frac{\sigma}{\gamma}(Q_r(\gamma) - 1)\right)^2. \tag{10.47}$$

Since $\sigma$ appears linearly in the above expression, minimization with respect to $\sigma$ can be achieved
analytically. Differentiating equation (10.47) with respect to $\sigma$ and setting the derivative equal to
zero results in

$$2\sum_{r=1}^{m} \left(Z_r - \frac{\sigma}{\gamma}(Q_r(\gamma) - 1)\right)\left(-\frac{1}{\gamma}(Q_r(\gamma) - 1)\right) = 0. \tag{10.48}$$

The solution for $\sigma$ from the above equation is

$$\sigma(\gamma) = \gamma\,\frac{\sum_{r=1}^{m} Z_r (Q_r(\gamma) - 1)}{\sum_{r=1}^{m} (Q_r(\gamma) - 1)^2}. \tag{10.49}$$

The expression for $\sigma$ is substituted in equation (10.47) and the resulting expression is minimized
with respect to $\gamma$. The resulting expression after the substitution is nonlinear and the minimization
cannot be performed analytically. Using the Nelder-Mead algorithm [72], the minimization is
done numerically. Once the estimate of $\gamma$ is obtained, $\sigma$ is obtained from equation (10.49).

Recall that the GPD is being used to approximate the tail of the underlying distribution.
Hence, the order statistics $Z_r$, $r = 1, 2, \ldots, m$, from the GPD actually correspond to the order
statistics $X_{n-m+1} - x_0, X_{n-m+2} - x_0, \ldots, X_n - x_0$ from the underlying distribution.

The least squares procedure results in a computationally convenient algorithm. It is emphasized
that the minimization of $S$ is carried out only with respect to the single parameter $\gamma$.
Furthermore, the underlying criterion is based on minimizing the distance between the empirical
values and the expected values of the ordered samples. Some numerical comparisons are given
in section 10.4.
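
A minimal Python sketch of the OSLS procedure is given below. Several points are assumptions of this sketch rather than statements from the report: the sample size in $Q_r(\gamma)$ is taken to be the number of tail observations $m$; the quantity $w_r = (Q_r(\gamma) - 1)/\gamma$ is used so that $E(Z_r) = \sigma w_r$ and equation (10.49) becomes $\sigma = \sum Z_r w_r / \sum w_r^2$, with the $\gamma \to 0$ limit $w_r = \sum_{k=1}^{r} 1/(m - k + 1)$; the search is restricted to $-2 \leq \gamma \leq 0.95$ (equation (10.45) requires $\gamma < 1$); and a grid search with golden-section refinement is substituted for the Nelder-Mead algorithm.

```python
import math

def osls_weights(m, gamma):
    """w_r = (Q_r(gamma) - 1) / gamma for r = 1..m, with Q_r from equation (10.45)."""
    weights, q, h = [], 1.0, 0.0
    for k in range(1, m + 1):
        q *= (m - k + 1) / (m - k + 1 - gamma)   # running product Q_k(gamma)
        h += 1.0 / (m - k + 1)                   # gamma -> 0 limit of w_k
        weights.append(h if abs(gamma) < 1e-8 else (q - 1.0) / gamma)
    return weights

def sigma_of_gamma(z_sorted, gamma):
    """Equation (10.49) rewritten in terms of w_r."""
    w = osls_weights(len(z_sorted), gamma)
    return sum(zr * wr for zr, wr in zip(z_sorted, w)) / sum(wr * wr for wr in w)

def residual_ss(z_sorted, gamma):
    """S of equation (10.47) with sigma(gamma) substituted."""
    w = osls_weights(len(z_sorted), gamma)
    sigma = sum(zr * wr for zr, wr in zip(z_sorted, w)) / sum(wr * wr for wr in w)
    return sum((zr - sigma * wr) ** 2 for zr, wr in zip(z_sorted, w))

def golden_min(f, a, b, tol=1e-10, max_iter=200):
    """Golden-section search for a minimum of f on [a, b]."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    fc, fd = f(c), f(d)
    for _ in range(max_iter):
        if abs(b - a) < tol:
            break
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def fit_gpd_osls(z, lo=-2.0, hi=0.95, n_grid=300):
    """Return (sigma_hat, gamma_hat); lo, hi and n_grid are assumed search settings."""
    zs = sorted(z)
    grid = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    vals = [residual_ss(zs, g) for g in grid]
    i = min(range(n_grid), key=vals.__getitem__)
    i = max(1, min(n_grid - 2, i))               # bracket the best grid point
    gamma = golden_min(lambda g: residual_ss(zs, g), grid[i - 1], grid[i + 1])
    return sigma_of_gamma(zs, gamma), gamma
```

Only the single parameter $\gamma$ is searched numerically; $\sigma$ follows in closed form at every trial value, which is the computational convenience noted in the text.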

10.3.2  Estimation  of  Thresholds 

The Generalized Pareto Distribution that is estimated from the data is used to approximate
the tail of the unknown, underlying distribution. We now show that the threshold is related to
the approximating distribution function in a direct manner. With reference to equation (10.19),
let $\hat{\eta}_p$ denote the estimate of the threshold corresponding to a false alarm probability
$p$. We then have

$$F(\hat{\eta}_p) = 1 - p = 1 - a\left[1 + \frac{\gamma}{\sigma}(\hat{\eta}_p - x_0)\right]^{-1/\gamma}. \tag{10.50}$$

Solution for $\hat{\eta}_p$ results in

$$\hat{\eta}_p = x_0 + \sigma(q^{-\gamma} - 1)/\gamma \tag{10.51}$$

where $a = 1 - F(x_0)$, $q = p/a$ and $x_0 = F^{-1}(1 - a)$. For many applications DuMouchel
[70] suggests that $a = 0.1$ be used. As will be discussed in the subsequent sections, the optimal
value of $a$ depends on the threshold being estimated. Since the distribution function $F(x)$ is not
known, $x_0$ cannot be determined for a given value of $a$. Therefore, following common practice,
the sample order statistic $X_{n-m}$, where $m = [an]$ and $[\,\cdot\,]$ denotes the integer part operator, is
used as an estimate of $x_0$.
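
The two steps just described, selecting the tail sample and inverting equation (10.50), can be sketched in Python as follows. The function names, the round-off guard in the integer part, and the use of the exponential limit $\hat{\eta}_p = x_0 - \sigma \ln q$ when $\gamma$ is numerically zero are assumptions of this sketch; $q = p/a$ follows from solving equation (10.50) for $\hat{\eta}_p$.

```python
import math

def tail_sample(samples, a=0.1):
    """Return (x0, z): x0 = X_{n-m} with m = [a n], and the m excesses above x0."""
    xs = sorted(samples)
    n = len(xs)
    m = int(a * n + 1e-9)            # integer part of a*n (guard against round-off)
    x0 = xs[n - m - 1]               # order statistic X_{n-m} (1-based index n-m)
    z = [x - x0 for x in xs[n - m:]] # X_{n-m+1} - x0, ..., X_n - x0
    return x0, z

def gpd_threshold(x0, sigma, gamma, p, a=0.1):
    """Threshold of equation (10.51) for false alarm probability p."""
    q = p / a
    if abs(gamma) < 1e-12:           # gamma -> 0 limit of (q**(-gamma) - 1) / gamma
        return x0 - sigma * math.log(q)
    return x0 + sigma * (q ** (-gamma) - 1.0) / gamma
```

For example, with $x_0 = 2$, $\sigma = 1$, $\gamma = 0$, $a = 0.1$ and $p = 0.01$, the threshold is $2 + \ln 10$, because the exponential tail must decay by a further factor of ten beyond $x_0$.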
10.4  Numerical  Results

10.4.1  Characterisation  of  Tail  Shape  for  Known  Distributions

We first discuss a method for estimating the parameters of the GPD when the underlying
distribution is known. Choose $x_0$ such that $1 - F(x_0) = 0.1$. Then define the points $p_i$,
$i = 1, 2, \ldots, 1000$, by

$$p_i = 0.90005 + 0.0001(i - 1). \tag{10.52}$$

Analytically evaluate $x_i = F^{-1}(p_i)$ from the known distribution. Using the 1000 values of
$x_i$, the maximum likelihood estimation, the ordered sample least squares and the probability
weighted moments procedures were applied to determine the corresponding $\gamma$ values for various
distributions. The results are given in Table 10.1. The number in parentheses for the Weibull
and Lognormal distributions is the value of the shape parameter. For the remaining distributions
the number denotes the degrees of freedom. Since $\sigma$ is a scale parameter, the shape parameter
$\gamma$ best describes the tail shape. For the exponential and the uniform distributions the value of $\gamma$
can be obtained theoretically: $\gamma = 0$ for the exponential distribution and $\gamma = -1$ for the uniform
distribution. Since the size of the tail decreases with decreasing $\gamma$, the relationship between the
tail behavior and the corresponding values of the shape parameter $\gamma$ can be clearly inferred from
this table.
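Column headers in the source: distribution, OSLS, and two further headers that are illegible.
Entries are listed in source order; [illegible] marks an unreadable entry, and rows with fewer
than three entries are incomplete in the source.

Gaussian: -0.151, [illegible]
Weibull(3): [illegible], -0.168
Weibull(.67): 0.129, 0.137
Weibull(.5): 0.265, [illegible]
Student-t(3): [illegible], 0.260, [illegible]
Student-t(5): 0.099, [illegible]
Student-t(8): 0.006, -0.010
Lognormal(1): [illegible], 0.259
Chi-square(1): [illegible], 0.034, 0.044
Chi-square(4): -0.024, -0.033, -0.034
Chi-square(8): -0.047, -0.058, -0.064

Table 10.1: Tail parameter $\gamma$ describing the upper ten percent of various distributions.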
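
The construction of the points in equation (10.52) can be illustrated with a short Python sketch. The exponential and uniform distributions are used here because their tail parameters are known theoretically; the probability weighted moments fit of equations (10.31) and (10.32) is used only as a convenient closed-form estimator. The function names are assumptions of this sketch, and the printed values are not entries of Table 10.1.

```python
import math

def tail_points(inverse_cdf, count=1000):
    """x_i = F^{-1}(p_i) with p_i = 0.90005 + 0.0001 (i - 1), equation (10.52)."""
    return [inverse_cdf(0.90005 + 0.0001 * (i - 1)) for i in range(1, count + 1)]

def pwm_fit(z):
    """Closed-form PWM estimates (sigma, gamma) from equations (10.31)-(10.32)."""
    zs = sorted(z)
    m = len(zs)
    c0 = sum(zs) / m
    c1 = sum((m - i) * zi for i, zi in enumerate(zs, start=1)) / (m * (m - 1))
    return 2.0 * c0 * c1 / (c0 - 2.0 * c1), 2.0 - c0 / (c0 - 2.0 * c1)

def tail_shape(inverse_cdf):
    """Fit the GPD to the upper ten percent of a known distribution."""
    x0 = inverse_cdf(0.9)                       # 1 - F(x0) = 0.1
    z = [x - x0 for x in tail_points(inverse_cdf)]
    return pwm_fit(z)

if __name__ == "__main__":
    print(tail_shape(lambda p: -math.log(1.0 - p)))   # unit exponential: gamma near 0
    print(tail_shape(lambda p: p))                    # uniform on (0, 1): gamma near -1
```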


10.4.2  Empirical  Properties  of  the  Estimators  for  Known  Distributions 

Seven distributions with widely differing tail behaviors were chosen in order to investigate the
adequacy of the approximation of extreme tails by the GPD and to compare the three estimation
procedures. The gamma distribution and the Weibull distribution with shape parameter of value 3
have tails lighter than those of the exponential PDF. The tails of the chi-square distribution with
4 degrees of freedom and the Student-t distribution with 8 degrees of freedom are approximately
the same as those of the exponential PDF. Finally, the Student-t distribution with 4 degrees of
freedom and the Lognormal distribution with shape parameter of value 1 have tails heavier than
those of the exponential PDF.

Let $\eta$ and $\hat{\eta}$ denote the true and estimated thresholds, respectively. A Monte Carlo experiment
was performed to investigate the normalized bias, $(\hat{\eta} - \eta)/\eta$, and the normalized mean square
error, $((\hat{\eta} - \eta)/\eta)^2$, of the proposed threshold estimates. The four sample sizes given by
$m = 25, 50, 100$ and 1000 were considered. Each set of samples was obtained by generating $n$
observations and taking the largest $m = 0.1n$ observations. For example, a set of samples of size
25 was obtained by selecting the largest 25 observations from a collection of 250 samples. For all
four values of $m$, $k = 200{,}000/m$ trials were performed for each of the seven distributions. The
median of the normalized bias values was computed for each distribution and estimation
procedure. The results for $P_F = 10^{-k}$, $k = 2, 3, \ldots, 7$, are given in Table 10.2. Similarly, the
medians of the positive square root of the normalized mean square error are presented in Table
10.3. The results in the two tables differ because the sign of $(\hat{\eta} - \eta)/\eta$ is lost in the normalized
root mean square values computed in Table 10.3. Extremely poor estimates for $\eta$ were obtained
in some of the trials.
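
The structure of such an experiment can be sketched in Python for a single case. The sketch is only indicative and is not a reproduction of the report's experiment: it assumes a unit exponential distribution (whose true threshold is $\eta = -\ln P_F$), uses the probability weighted moments fit, a much smaller number of trials, and a fixed random seed. All names and default values are assumptions.

```python
import math
import random
import statistics

def pwm_fit(z_sorted):
    """PWM estimates (sigma, gamma) from equations (10.31)-(10.32)."""
    m = len(z_sorted)
    c0 = sum(z_sorted) / m
    c1 = sum((m - i) * zi for i, zi in enumerate(z_sorted, start=1)) / (m * (m - 1))
    return 2.0 * c0 * c1 / (c0 - 2.0 * c1), 2.0 - c0 / (c0 - 2.0 * c1)

def threshold_estimate(samples, p, a=0.1):
    """GPD-based threshold of equation (10.51) with x0 estimated by X_{n-m}."""
    xs = sorted(samples)
    n = len(xs)
    m = int(a * n + 1e-9)
    x0 = xs[n - m - 1]
    sigma, gamma = pwm_fit([x - x0 for x in xs[n - m:]])
    q = p / a
    if abs(gamma) < 1e-12:
        return x0 - sigma * math.log(q)
    return x0 + sigma * (q ** (-gamma) - 1.0) / gamma

def median_normalized_bias(p, n=1000, trials=200, seed=1):
    """Median of (eta_hat - eta) / eta over repeated trials, unit exponential data."""
    rng = random.Random(seed)
    eta = -math.log(p)                       # true threshold: exp(-eta) = p
    biases = []
    for _ in range(trials):
        samples = [rng.expovariate(1.0) for _ in range(n)]
        biases.append((threshold_estimate(samples, p) - eta) / eta)
    return statistics.median(biases)

if __name__ == "__main__":
    for k in (2, 3, 4):
        print(k, median_normalized_bias(10.0 ** (-k)))
```

The median is used, as in the text, so that the occasional extremely poor estimate does not dominate the summary.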
m = 25

Distribution | Method | Values for P_F = 10^-2, 10^-3, 10^-4, 10^-5, 10^-6, 10^-7 (left to right)
Rows with fewer than six entries are incomplete in the source; surviving entries are listed in
source order and [illegible] marks an unreadable entry.

Normal | OSLS | -0.0112, 0.0043, -0.0040, -0.0276, -0.0571
Normal | ML | -0.0034, 0.0187, 0.0328, 0.0353, 0.0281
Normal | PWM | -0.0084, -0.0208, -0.0560, -0.1015, -0.1464
Weibull(3) | OSLS | -0.0048, 0.0013, -0.0041, -0.0202, -0.0418, -0.0619
Weibull(3) | ML | 0.0039, 0.0481, 0.0938, 0.1374, 0.1776, 0.2137
Weibull(3) | PWM | -0.0037, -0.0106, -0.0333, -0.0635, -0.0919, -0.1216
t(4) | OSLS | -0.0424, -0.0792, -0.1658, -0.2727, -0.3872, -0.4922
t(4) | ML | -0.0166, -0.1115, -0.2526, -0.4045, -0.5416, -0.6541
t(4) | PWM | -0.0218, -0.0929, -0.2160, -0.3498, -0.4761, -0.5881
t(8) | OSLS | [illegible], -0.0186, -0.0572, -0.1164, [illegible]
t(8) | ML | -0.0104, -0.0468, -0.1169, -0.2077, -0.3055, [illegible]
t(8) | PWM | -0.0129, -0.0452, -0.1095, -0.2039, -0.3063, [illegible]
Chi-sq(4) | OSLS | [illegible], 0.0241
Chi-sq(4) | ML | 0.2518
Chi-sq(4) | PWM | -0.0334, [illegible], -0.1624
Lognormal | OSLS | -0.0835, -0.0982, -0.0634, 0.0016, 0.1007
Lognormal | ML | -0.0058, 0.1836, 0.5932, 1.2736, 2.4832, 4.4947
Lognormal | PWM | -0.0543, -0.0878, -0.0931, -0.0728, -0.0228, 0.0639
Pareto(-0.25) | OSLS | -0.0092, 0.0208, 0.0423, 0.0631, 0.0780
Pareto(-0.25) | ML | -0.0030, 0.0523, 0.1190, 0.1868, 0.2479
Pareto(-0.25) | PWM | -0.0077, 0.0052, 0.0121, 0.0199, 0.0237

Table 10.2: Median of the normalized bias values for different percentiles. OSLS: Ordered
Sample Least Squares, ML: Maximum Likelihood, PWM: Probability Weighted Moments
m = 50

Distribution | Method | Values for P_F = 10^-2, 10^-3, 10^-4, 10^-5, 10^-6, 10^-7 (left to right)

Normal | OSLS | 0.0036, 0.0073, -0.0068, -0.0354, -0.0676, -0.1022
Normal | ML | 0.0042, 0.0323, 0.0497, 0.0578, 0.0528, 0.0380
Normal | PWM | -0.0012, -0.0118, -0.0459, -0.0861, -0.1318, -0.1742
Weibull(3) | OSLS | -0.0022, -0.0007, -0.0133, -0.0337, -0.0571, -0.0838
Weibull(3) | ML | 0.0056, 0.0500, 0.0991, 0.1436, 0.1847, 0.2199
Weibull(3) | PWM | -0.0014, -0.0105, -0.0342, -0.0629, -0.0937, -0.1256
t(4) | OSLS | -0.0147, -0.0646, -0.1800, -0.3209, -0.4501, -0.5063
t(4) | ML | -0.0068, -0.0867, -0.2264, -0.3736, -0.5120, -0.6291
t(4) | PWM | -0.0078, -0.0622, -0.1662, -0.2973, -0.4233, -0.5391
t(8) | OSLS | -0.0062, -0.0222, -0.0841, -0.1723, -0.2694, -0.3703
t(8) | ML | -0.0031, -0.0502, -0.1352, -0.2385, -0.3460, -0.4517
t(8) | PWM | -0.0032, -0.0336, -0.1064, -0.2041, -0.3051, -0.4046
Chi-sq(4) | OSLS | -0.0092, -0.0004, 0.0051, 0.0060, -0.0498, -0.0686
Chi-sq(4) | ML | 0.0115, 0.1134, 0.2755, 0.4775, 0.6368, 0.9150
Chi-sq(4) | PWM | -0.0041, -0.0087, -0.0191, -0.0407, -0.1123, -0.1488
Lognormal | OSLS | -0.0544, -0.0594, -0.0272, 0.0458, 0.1573, 0.3274
Lognormal | ML | 0.0092, 0.2177, 0.6336, 1.3811, 2.6197, 4.7101
Lognormal | PWM | -0.0302, -0.0391, -0.0185, 0.0413, 0.1480, 0.2977
Pareto(-0.25) | OSLS | -0.0052, 0.0100, 0.0214, 0.0326, 0.0404, 0.0448
Pareto(-0.25) | ML | 0.0005, 0.0463, 0.1011, 0.1560, 0.2003, 0.2357
Pareto(-0.25) | PWM | -0.0050, -0.0018, -0.0012, -0.0019, -0.0023, -0.0012

Table 10.2: Median of the normalized bias values for different percentiles (contd.)
m = 100

Distribution | Method | Values for P_F = 10^-2, 10^-3, 10^-4, 10^-5, 10^-6, 10^-7 (left to right)
Rows with fewer than six entries are incomplete in the source; surviving entries are listed in
source order and [illegible] marks an unreadable row.

Normal | OSLS | [illegible]
Normal | ML | 0.0306, 0.0229, 0.0063, -0.0185
Normal | PWM | -0.0549, -0.1022, -0.1524, -0.1986
Weibull(3) | OSLS | 0.0005, -0.0017, -0.0164, -0.0376, -0.0624, -0.0888
Weibull(3) | ML | 0.0037, 0.0270, 0.0564, 0.0840, 0.1003, 0.1158
Weibull(3) | PWM | 0.0004, -0.0095, -0.0320, -0.0607, -0.0918, -0.1220
t(4) | OSLS | -0.0064, -0.0441, -0.1421, -0.2680, -0.3922, -0.5031
t(4) | ML | -0.0004, -0.0564, -0.1650, -0.2907, -0.4174, -0.5354
t(4) | PWM | -0.0003, -0.0478, -0.1403, -0.2636, -0.3809, -0.4949
t(8) | OSLS | -0.0024, -0.0134, -0.0751, -0.1606, -0.2578, -0.3548
t(8) | ML | 0.0011, -0.0342, -0.1145, -0.2123, -0.3157, -0.4216
t(8) | PWM | 0.0013, -0.0271, -0.0955, -0.1888, -0.2892, -0.3916
Chi-sq(4) | OSLS | -0.0032, -0.0028, -0.0077, -0.0198, -0.0841, -0.1111
Chi-sq(4) | ML | 0.0175, 0.1189, 0.2655, 0.4581, 0.5917, 0.8298
Chi-sq(4) | PWM | -0.0004, -0.0089, -0.0238, -0.0448, -0.1143, -0.1520
Lognormal | OSLS | -0.0159, -0.0542, -0.0876, -0.1089, -0.0940, -0.0617
Lognormal | ML | -0.0111, -0.0251, -0.0068, 0.0536, 0.1499, 0.3104
Lognormal | PWM | -0.0165, -0.0210, 0.0141, 0.0924, 0.2315, 0.3965
Pareto(-0.25) | OSLS | -0.0023, 0.0109, 0.0255, 0.0350, 0.0419, 0.0471
Pareto(-0.25) | ML | 0.0033, 0.0544, 0.1170, 0.1739, 0.2215, 0.2611
Pareto(-0.25) | PWM | -0.0014, 0.0004, 0.0052, 0.0084, 0.0112, 0.0129

Table 10.2: Median of the normalized bias values for different percentiles (contd.)
m=1000


Table  10.2:  Median  of  the  normalized  bias  values  for  different  percentiles. 
                        P_F
Distribution    Method    10^-2    10^-3    10^-4    10^-5    10^-6    10^-7
Normal          OSLS     0.0558   0.1127   0.2022   0.2825   0.3507   0.4044
Normal          ML       0.0558   0.0909   0.1459   0.2057   0.2588   0.3070
Normal          PWM      0.0559   0.1215   0.2121   0.2920   0.3586   0.4117
Weibull(3)      OSLS     0.0257   0.0577   0.1089   0.1580   0.2031   0.2415
Weibull(3)      ML       0.0258   0.0531   0.0950   0.1378   0.1780   0.2139
Weibull(3)      PWM      0.0256   0.0624   0.1149   0.1659   0.2110   0.2495
t(4)            OSLS     0.1069   0.2261   0.4160   0.5989   0.7397   0.8405
t(4)            ML       0.1051   0.2353   0.4157   0.5812   0.7127   0.8097
t(4)            PWM      0.1019   0.2329   0.4213   0.5956   0.7368   0.8344
t(8)            OSLS     0.0781   0.1666   0.3073   0.4455   0.5701   0.6730
t(8)            ML       0.0779   0.1493   0.2554   0.3648   0.4689   0.5649
t(8)            PWM      0.0775   0.1752   0.3180   0.4544   0.5783   0.6787
Chi-sq(4)       OSLS     0.0610   0.1313   0.2441   0.3592   0.4650   0.5455
Chi-sq(4)       ML       0.0721   0.2179   0.4459   0.7901   1.1783   1.7789
Chi-sq(4)       PWM      0.0592   0.1384   0.2500   0.3622   0.4666   0.5446
Lognormal       OSLS     0.1335   0.2452   0.4362   0.6271   0.7785   0.8785
Lognormal       ML       0.1439   0.4007   0.7303   1.4149   2.7312   5.0774
Lognormal       PWM      0.1260   0.2582   0.4462   0.6281   0.7737   0.8705
Pareto(-0.25)   OSLS     0.0409   0.0787   0.1348   0.1752   0.2017   0.219
Pareto(-0.25)   ML       0.0402   0.0763   0.1419   0.2075   0.2640   0.3127
Pareto(-0.25)   PWM      0.0411   0.0866   0.1430   0.1817   0.2084   0.2240

Table 10.3: Median RMS errors for various percentiles. OSLS: Ordered Sample Least
Squares, ML: Maximum Likelihood, PWM: Probability Weighted Moments
                        P_F
Distribution    Method    10^-2    10^-3    10^-4    10^-5    10^-6    10^-7
Normal          OSLS     0.0401   0.0772   0.1391   0.1981   0.2548   0.3042
Normal          ML       0.0394   0.0689   0.1122   0.1559   0.1959   0.2328
Normal          PWM      0.0399   0.0865   0.1530   0.2192   0.2759   0.3273
Weibull(3)      OSLS     0.0180   0.0393   0.0743   0.1135   0.1511   0.1854
Weibull(3)      ML       0.0185   0.0509   0.0997   0.1447   0.1859   0.2214
Weibull(3)      PWM      0.0180   0.0442   0.0852   0.1263   0.1661   0.2017
t(4)            OSLS     0.0779   0.1826   0.3506   0.5179   0.6633   0.7724
t(4)            ML       0.0768   0.1910   0.3602   0.5244   0.6688   0.7762
t(4)            PWM      0.0760   0.1778   0.3332   0.4899   0.6303   0.7386
t(8)            OSLS     0.0561   0.1228   0.2316   0.3503   0.4666   0.5698
t(8)            ML       0.0553   0.1219   0.2226   0.3385   0.4504   0.5529
t(8)            PWM      0.0554   0.1306   0.2405   0.3613   0.4793   0.5807
Chi-sq(4)       OSLS     0.0431   0.0890   0.1678   0.2509   0.3351   0.4109
Chi-sq(4)       ML       0.0489   0.1661   0.3386   0.5487   0.7664   1.1112
Chi-sq(4)       PWM      0.0426   0.0939   0.1747   0.2584   0.3431   0.4185
Lognormal       OSLS     0.0975   0.1834   0.3439   0.5155   0.6660   0.7990
Lognormal       ML       0.0993   0.3381   0.6769   1.3921   2.6297   4.7240
Lognormal       PWM      0.0864   0.1954   0.3510   0.5143   0.6621   0.8012
Pareto(-0.25)   OSLS     0.0289   0.0534   0.0890   0.1162   0.1346   0.1486
Pareto(-0.25)   ML       0.0284   0.0602   0.1149   0.1675   0.2084   0.2417
Pareto(-0.25)   PWM      0.0293   0.0616   0.1032   0.1320   0.1533   0.1666

Table  10.3:  Median  RMS  errors  for  various  percentiles,  (contd.) 
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0.5497 

0.6627 
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t(8) 
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t(8) 
                        P_F
Distribution    Method    10^-2    10^-3    10^-4    10^-5    10^-6    10^-7
Chi-sq(4)       OSLS     0.0287   0.0649   0.1264   0.1932   0.2699   0.3373
Chi-sq(4)       ML       0.0350   0.1437   0.2959   0.4688   0.6092   0.8592
Chi-sq(4)       PWM      0.0283   0.0686   0.1289   0.1948   0.2730   0.3383
Lognormal       OSLS     0.0683   0.1527   0.2794   0.4174   0.5299   0.6290
Lognormal       ML       0.0652   0.1515   0.2690   0.4039   0.5465   0.6769
Lognormal       PWM      0.0647   0.1417   0.2519   0.3805   0.5218   0.6710
Pareto(-0.25)   OSLS     0.0201   0.0372   0.0637   0.0845   0.0997   0.1110
Pareto(-0.25)   ML       0.0197   0.0568   0.1192   0.1746   0.2221   0.2613
Pareto(-0.25)   PWM      0.0201   0.0434   0.0718   0.0952   0.1108   0.1220

Table  10.3:  Median  RMS  errors  for  various  percentiles,  (contd.) 
m=1000

                        P_F
Distribution    Method    10^-2    10^-3    10^-4    10^-5    10^-6    10^-7
Normal          OSLS     0.0077   0.0182   0.0373   0.0643   0.1017   0.1440
Normal          ML       0.0087   0.0160   0.0362   0.0632   0.1075   0.1476
Normal          PWM      0.0081   0.0247   0.0586   0.1064   0.1560   0.2016
Weibull(3)      OSLS     0.0037   0.0086   0.0194   0.0393   0.0630   0.0890
Weibull(3)      ML       0.0040   0.0078   0.0191   0.0397   0.0649   0.0909
Weibull(3)      PWM      0.0036   0.0108   0.0300   0.0578   0.0880   0.1192
t(4)            OSLS     0.0203   0.0534   0.1383   0.2476   0.3717   0.4763
t(4)            ML       0.0213   0.0447   0.1083   0.2168   0.3326   0.4406
t(4)            PWM      0.0213   0.0499   0.1207   0.2306   0.3479   0.4598
t(8)            OSLS     0.0135   0.0298   0.0726   0.1379   0.2121   0.3018
t(8)            ML       0.0129   0.0272   0.0750   0.1518   0.2436   0.3406
t(8)            PWM      0.0129   0.0349   0.0939   0.1830   0.2863   0.3863
Chi-sq(4)       OSLS     0.0104   0.0207   0.0362   0.0588   0.1094   0.1408
Chi-sq(4)       ML       0.0099   0.0192   0.0363   0.0589   0.1095   0.1429
Chi-sq(4)       PWM      0.0100   0.0211   0.0400   0.0602   0.1103   0.1433
Lognormal       OSLS     0.0206   0.0528   0.1222   0.1836   0.2429   0.3276
Lognormal       ML       0.0195   0.0434   0.0984   0.2012   0.3581   0.5999
Lognormal       PWM      0.0201   0.0410   0.0927   0.1919   0.3445   0.5770
Pareto(-0.25)   OSLS     0.0061   0.0101   0.0158   0.0213   0.0247   0.0278
Pareto(-0.25)   ML       0.0063   0.0092   0.0154   0.0198   0.0243   0.0268
Pareto(-0.25)   PWM      0.0065   0.0126   0.0222   0.0306   0.0375   0.0428

Table  10.3:  Median  RMS  errors  for  various  percentiles. 
These poor estimates could severely influence an arithmetic mean of the estimates. To avoid this
problem, median values were used in place of the arithmetic means.

The empirical results in Table 10.2 indicate that the newly proposed ordered sample least
squares estimator generally has a smaller normalized bias than the other estimators for small or
moderate sample sizes. Overall, the second smallest normalized bias is achieved by the probability
weighted moments method. The maximum likelihood estimator has the largest normalized bias
when $P_F > 10^{-5}$, especially for the long-tailed distributions. The normalized bias of all three
estimators decreases as the sample size increases. When the parent distribution is the GPD, all three
estimators perform very well. Even so, the ordered sample least squares estimator outperforms
the others. The relatively strong performance for the GPD is explained as follows. Extreme
value theory is based on the premise that the tails of smooth continuous distributions tend towards
the GPD. For the GPD, this premise is exactly satisfied. Hence, the corresponding performance
is noticeably better than for other distributions.

The results for the median of the normalized root mean square error are surprising. The
maximum likelihood estimator is known to be asymptotically efficient. This is always true when
the samples are drawn from the underlying distribution (in our case from the generalized Pareto
distribution). This property of the maximum likelihood estimator can be observed in Table 10.3
when $m = 1000$, but not for smaller sample sizes. Although the ordered sample least squares
method has a smaller normalized root mean square error in many cases, there is no clear winner
with respect to this criterion.

The empirical results are based on a limited number of distributions and sample
sizes, so it is not easy to make a strong recommendation as to which method to use in practice.
However, in terms of the normalized bias, the ordered sample least squares estimator appears
to perform better than the other estimators in estimating the large thresholds when $P_F < 10^{-6}$.
In any event, it is seen that extreme value theory can be used successfully to determine
threshold values when the false alarm probability is very small.

Two practical advantages of estimation based on extreme value theory are: 1) When there is
a constraint on the number of samples, the thresholds obtained from extreme value theory are
theoretically expected to be closer to the true thresholds than those obtained by conventional
Monte Carlo techniques. However, in both techniques an increase in sample size offers greater
accuracy in estimating thresholds. 2) Because the estimate of the tail of the underlying distribution
is in closed form, thresholds can be estimated for extremely small
false alarm probabilities independent of the sample size. In experiments with fixed amounts of
data, this is an important advantage.

10.4.3  Effect of the Choice of $\alpha$ on the Threshold Estimates

As mentioned previously, only samples whose values exceed $x_0$ are used in estimating the
GPD parameters. The value of $x_0$ is determined by $\alpha$. The results presented in Tables 10.2-10.3
were obtained by means of Monte Carlo experiments in which $\alpha = 0.1$ was used regardless of the
value of $P_F$ for which the threshold was being estimated. When the false alarm probability was
extremely small, the bias and root mean square errors were quite large for some distributions.
This is because the GPD is intended to model the extreme tail of the underlying
distribution. The smaller the value of $\alpha$, the better the GPD approximation will be over the
extreme tail. When $\alpha$ is chosen too large, a better fit is found for the
portion of the distribution closer to the center, at the expense of accuracy in the extreme
tail. Of course, there is a tradeoff between the choice of $\alpha$ and the number of data samples
available for determining the parameters of the GPD.
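The role of $\alpha$ can be sketched in a few lines of Python. This is only an illustration, not the report's code: the function name `select_tail` is hypothetical, and the convention that $x_0$ is the order statistic of rank $n - \alpha n$ is an assumption chosen to match the later example in which the top 100 of 10,000 sorted values are used with $\alpha = 0.01$.

```python
import numpy as np

def select_tail(samples, alpha):
    """Return (x0, excesses): the tail threshold x0 and the exceedances above it.

    Assumption: x0 is the order statistic of rank n - alpha*n, so that the
    largest alpha*n samples are the ones used for fitting the GPD.
    """
    ordered = np.sort(np.asarray(samples, dtype=float))
    n = ordered.size
    n_tail = int(round(alpha * n))
    if not 0 < n_tail < n:
        raise ValueError("alpha must leave at least one sample on each side of x0")
    x0 = ordered[n - n_tail - 1]          # e.g. the 9900th value when n = 10,000
    excesses = ordered[n - n_tail:] - x0  # the alpha*n largest samples, shifted by x0
    return x0, excesses

rng = np.random.default_rng(0)
data = rng.standard_normal(10_000)
for alpha in (0.10, 0.05, 0.01):
    x0, excesses = select_tail(data, alpha)
    print(f"alpha={alpha:.2f}: x0={x0:.3f}, samples available for the fit={excesses.size}")
```

A smaller $\alpha$ moves $x_0$ further into the tail but leaves fewer samples for the fit, which is the tradeoff described above.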

In our application the major objective is to approximate the extreme tails corresponding to
false alarm probabilities of $10^{-6}$ or smaller. Consequently, we explored the implications of selecting values
less than 0.1 for $\alpha$. To accomplish this, we obtained the theoretical values $x_i = F^{-1}(p_i)$,
$i = 1, 2, \ldots, n$, for the standard Normal and Lognormal distributions, with the probabilities $p_i$
as used in Section 10.4.1 and with $n$ = 1,000 and 10,000. These two distributions are chosen because they
represent extremes: the Normal distribution is light tailed while the Lognormal is a heavy tailed
distribution.

The number of $x_i$ samples used to determine the parameters of the GPD is given by
$\alpha n$. The parameters were estimated using the OSLS procedure for values of $\alpha$ equal to 0.1,
0.05 and 0.01. The resulting GPDs were then used to determine the thresholds for false alarm
probabilities given by $P_F = 10^{-k}$, where $k = 2, 3, \ldots, 7$. These results are presented in figure 10.6,
where both the theoretical and approximated thresholds are plotted as a function of $k$ for (A)
Normal distribution ($n$ = 10,000), (B) Normal distribution ($n$ = 1,000), (C) Lognormal distribution
($n$ = 10,000), (D) Lognormal distribution ($n$ = 1,000). For $k > 5$, it is seen that $\alpha = 0.01$ (curve b)
appears to be the best choice for approximating the thresholds. The best results were obtained
with $n$ = 10,000. However, good results were also obtained with $n$ = 1,000.
Figure 10.6 (A): Normal distribution, n = 10,000. Thresholds for $P_F = 10^{-k}$. Data points
correspond to k = 2, 3, ..., 7. a: True, b: $\alpha$ = 0.01, c: $\alpha$ = 0.05, d: $\alpha$ = 0.10.

Figure 10.6 (B): Normal distribution, n = 1,000. Thresholds for $P_F = 10^{-k}$. Data points
correspond to k = 2, 3, ..., 7. a: True, b: $\alpha$ = 0.01, c: $\alpha$ = 0.05, d: $\alpha$ = 0.10.

Figure 10.6 (C): Lognormal distribution, n = 10,000. Thresholds for $P_F = 10^{-k}$. Data points
correspond to k = 2, 3, ..., 7. a: True, b: $\alpha$ = 0.01, c: $\alpha$ = 0.05, d: $\alpha$ = 0.10.

Figure 10.6 (D): Lognormal distribution, n = 1,000. Thresholds for $P_F = 10^{-k}$. Data points
correspond to k = 2, 3, ..., 7. a: True, b: $\alpha$ = 0.01, c: $\alpha$ = 0.05, d: $\alpha$ = 0.10.
10.5  Examples

10.5.1  Known  Distribution  Case 

To evaluate the accuracy of the threshold estimates, 10,000 random samples were generated
from the Gaussian and Lognormal distributions, and the upper tails of these two distributions
were modeled as Generalized Pareto. In sections 10.4.1 and 10.4.3, theoretical values given by
$x_i = F^{-1}(p_i)$ were used to estimate the tail. In this section randomly generated samples are
used in place of the theoretical values. The theoretical thresholds of the
Gaussian distribution for $P_F = 10^{-k}$, $k = 2, 3, \ldots, 7$, are 2.326, 3.090, 3.719, 4.265, 4.753 and 5.199,
respectively. Choosing $\alpha = 0.01$, the estimated thresholds are 2.315, 3.223, 3.847, 4.370, 4.855 and 5.292. For the
Lognormal distribution the theoretical thresholds corresponding to $P_F = 10^{-k}$, $k = 2, 3, \ldots, 7$, are
10.240, 21.982, 41.224, 71.157, 115.981 and 181.152. Once again using $\alpha = 0.01$, the estimated
thresholds are 10.449, 22.862, 42.473, 69.216, 112.229 and 183.495. Note that the estimated
results are very close to the true thresholds. These results were obtained on the
basis of one set of observations from each of the two known distributions, corresponding to a particular
seed value. For a different set of samples the estimates will differ, depending on the tail
behavior of that set of samples. However, unless the samples are not truly representative of the
distribution from which they are drawn, we expect that estimates based on different samples
will give threshold values that yield false alarm probabilities close to the design value.
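The theoretical thresholds quoted above can be checked directly. The following minimal sketch assumes the Gaussian is the standard Normal and the Lognormal is the exponential of a standard Normal variate, which is consistent with the quoted values; the function names are illustrative and not taken from the report.

```python
import math
from statistics import NormalDist

def gaussian_threshold(pf):
    """Threshold t with P(Z > t) = pf for a standard Normal variate Z."""
    # By symmetry P(Z > t) = P(Z < -t), which avoids forming 1 - pf.
    return -NormalDist().inv_cdf(pf)

def lognormal_threshold(pf):
    """Threshold for exp(Z), where Z is standard Normal (assumed parameters)."""
    return math.exp(gaussian_threshold(pf))

for k in range(2, 8):
    pf = 10.0 ** -k
    print(f"P_F = 1e-{k}: Gaussian {gaussian_threshold(pf):.3f}, "
          f"Lognormal {lognormal_threshold(pf):.3f}")
```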

10.5.2  An  Unknown  Distribution  Case 


In  the  previous  section  the  underlying  distributions  were  known  to  us  and  the  estimates  based 
on  the  extreme  value  theory  were  encouraging  for  both  light  and  heavy  tail  behavior.  In  this 
example,  we  take  a  non-Gaussian  problem  where  the  underlying  distribution  is  unknown. 

The two hypotheses characterizing the detection problem are given in equations (9.1)-(9.2). We
consider the weak signal case for which the clutter is much stronger than the background noise.
The locally optimum detector (LOD) [74] has been shown to be suitable for the weak signal
detection problem. Under hypothesis $H_1$, the signal is denoted by $\theta s_i$, where $\theta$ is a measure of
the signal strength. For a deterministic signal and a given set of observations $\mathbf{r} = [r_1, r_2, \ldots, r_N]^T$
the LOD performs the LRT

$$ L(\mathbf{r}) = \frac{\left.\dfrac{\partial f_{\mathbf{R}}(\mathbf{r} \mid H_1)}{\partial \theta}\right|_{\theta = 0}}{f_{\mathbf{R}}(\mathbf{r} \mid H_0)} \; \underset{H_0}{\overset{H_1}{\gtrless}} \; \eta \tag{10.53} $$

where $f_{\mathbf{R}}(\mathbf{r} \mid H_i)$ is the joint PDF of $r_1, r_2, \ldots, r_N$ under hypothesis $H_i$, $i = 0, 1$.

Martinez, Swaszek and Thomas [75] studied the locally optimal detection problem for
non-Gaussian distributions and considered the bivariate Laplace distribution as an example. In this
section we illustrate the procedure for determining the thresholds of a LOD based on $N = 2$ and
the received samples having the bivariate Laplace distribution given by

$$ f_{\mathbf{R}}(r_1, r_2) = \frac{1}{\pi |M|^{1/2}} K_0\!\left[(2\,\mathbf{r}^T M^{-1} \mathbf{r})^{1/2}\right] \tag{10.54} $$

where $M$ is the covariance matrix for the two samples, $|M|$ denotes its determinant, $\mathbf{r}^T M^{-1} \mathbf{r}$ is
equal to $(r_1^2 - 2\rho r_1 r_2 + r_2^2)/(1 - \rho^2)$, $\rho$ is the correlation coefficient between $R_1$ and $R_2$, and $K_0(\cdot)$
is the modified Bessel function of the second kind of zero order. The resulting locally optimum
detector statistic is [75]


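$$ T_{LOD}(r_1, r_2) = \left(\frac{2}{\mathbf{r}^T M^{-1} \mathbf{r}}\right)^{1/2} \frac{K_1\!\left[(2\,\mathbf{r}^T M^{-1} \mathbf{r})^{1/2}\right]}{K_0\!\left[(2\,\mathbf{r}^T M^{-1} \mathbf{r})^{1/2}\right]} \times \mathbf{s}^T M^{-1} \mathbf{r} \tag{10.55} $$

where $\mathbf{s} = (s_1, s_2)^T$, $\mathbf{s}^T M^{-1} \mathbf{r} = (r_1 - \rho r_2)s_1 + (r_2 - \rho r_1)s_2$, and $K_1(\cdot)$ is the modified Bessel function
of the second kind of first order. $s_1$ and $s_2$ are the known signal levels. In this example we take
$s_1 = 1$ and $s_2 = -1$. Because of the complexity of $T_{LOD}(\cdot)$, it is not possible to determine a
closed form expression for its probability density function.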

In many radar applications, thresholds have to be set to achieve desired false alarm
probabilities based on a sample size which is orders of magnitude less than $10/P_F$. As will be pointed
out later, the statistic in equation (10.55) represents a worst case situation in the sense that our
simulations indicate that the variance of the test statistic is extremely large. To investigate the
reliability of the thresholds estimated from extreme value theory with smaller sample sizes,
10,000 pairs of observations $(r_1, r_2)$ were generated from the bivariate Laplace distribution given
in equation (10.54), with $\rho = 0.90$. The values of $T_{LOD}(r_1, r_2)$ were computed for each pair and
sorted in increasing order. Corresponding to $\alpha = 0.01$, the largest 100 values of the
statistic (the top one percent) were selected to fit the Generalized Pareto Distribution. This
experiment was repeated 250 times. The threshold corresponding to a given false alarm
probability $P_F$ of the distribution of the statistic $T_{LOD}(r_1, r_2)$ is estimated from equation (10.51)
as $t_{P_F} = x_0 + \hat{\sigma}\,[(P_F/\alpha)^{-\hat{\gamma}} - 1]/\hat{\gamma}$, where $x_0$ is the 9900th ordered value of the statistic.
Thresholds were estimated for false alarm probabilities $P_F = 10^{-k}$, $k = 2, \ldots, 7$, for each repetition of
the experiment. Histograms of these threshold values are shown in figure 10.7 for the different
values of $P_F$. To give a better appreciation for the range of values, the bins are not necessarily of equal
width. The histograms give an indication of the spread in the threshold values depending on
the particular samples collected.

From the histograms corresponding to false alarm probabilities
of $10^{-2}$, $10^{-3}$ and $10^{-4}$ we can see that the threshold estimate obtained on the basis of even
one set of samples is likely to yield approximately the desired $P_F$. Since the underlying distribution
of $T_{LOD}(\cdot)$ is unknown, one measure of the accuracy of the estimate is the extent to which
most of the estimates fall in one bin of the histogram. Also, we can see that there is negligible
overlap between the estimated threshold values in the histograms for the three different values of $P_F$.
This supports the claim that the estimated threshold is likely to yield a false alarm probability
which is of the same order as the desired $P_F$. There is a higher overlap in the thresholds of
the histograms for $P_F = 10^{-5}$, $10^{-6}$ and $10^{-7}$. Also, there is a much higher spread in the threshold
values estimated. Given the excellent results obtained for the same choices of $P_F$ in the
known cases of the previous section, these results are surprising. However, they are explained as
follows. The $\gamma$ values of the GPD estimated for the different repetitions of this experiment lie in
the range 0.45 to 0.55. This represents an extremely heavy tailed distribution. From Table 10.1
we see that the Lognormal distribution, which is quite a heavy tailed distribution, has $\gamma = 0.232$.
The heavy tailed nature of the detector statistic can also be observed by comparing the large
threshold values seen in the histograms with the corresponding thresholds of the Gaussian and
the Lognormal distributions. The variance of the GPD is given by


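$$ \mathrm{Var}(X) = \begin{cases} \dfrac{\sigma^2}{(1 - \gamma)^2 (1 - 2\gamma)}, & \gamma < 0.5 \\[2mm] \infty, & \gamma \geq 0.5 \end{cases} \tag{10.56} $$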


Thus, the bivariate Laplace results in a very highly fluctuating statistic with an extremely large
variance. As such, it represents a 'worst case' situation for empirically determining the
threshold. A much larger sample size is needed to obtain reliable threshold estimates because of the
exceedingly heavy tail of the underlying distribution.

In general, an indication of how heavy the true tail may be for an unknown distribution is
given by the estimate of $\gamma$ for the GPD. When an extremely heavy tail is indicated, another
strategy for estimating the thresholds when $P_F$ is very small is to choose the median value of the
thresholds estimated when the experiment is repeated a specified number of times with 10,000
samples in each repetition. The choice of the median as the estimator ensures that very large
and very small values do not affect the results. For the present example, we chose to repeat the
250 trials three times. By counting the number of estimates that fell into the bins centered at
20, 28 and 36 for $P_F = 10^{-5}$, 40, 50, 70 and 90 for $P_F = 10^{-6}$, and 100 and 150 for $P_F = 10^{-7}$, it was
found that 88 percent of the estimates fell into these bins. Thus, even for this extremely heavy
tailed example, we believe that use of the GPD has allowed us to estimate useful values for the
thresholds with sample sizes much smaller than $10/P_F$.
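The closed-form tail model used in this example can be sketched as follows. The sketch assumes the GPD tail model $P(T > t) = \alpha\,[1 + \gamma (t - x_0)/\sigma]^{-1/\gamma}$ for $t > x_0$, whose inversion gives $t_{P_F} = x_0 + \sigma[(P_F/\alpha)^{-\gamma} - 1]/\gamma$. The function names and the numerical parameter values are hypothetical and are not estimates from the report.

```python
import math

def gpd_threshold(x0, sigma, gamma, alpha, pf):
    """Threshold t with tail probability pf under the assumed GPD tail model."""
    if gamma == 0.0:
        # Limiting (exponential tail) case of the GPD.
        return x0 + sigma * math.log(alpha / pf)
    return x0 + sigma * ((pf / alpha) ** (-gamma) - 1.0) / gamma

def gpd_tail_probability(t, x0, sigma, gamma, alpha):
    """P(T > t) for t >= x0 under the same model (inverse of gpd_threshold)."""
    if gamma == 0.0:
        return alpha * math.exp(-(t - x0) / sigma)
    return alpha * (1.0 + gamma * (t - x0) / sigma) ** (-1.0 / gamma)

def gpd_variance(sigma, gamma):
    """Variance of the GPD: finite only when gamma < 0.5."""
    if gamma >= 0.5:
        return math.inf
    return sigma ** 2 / ((1.0 - gamma) ** 2 * (1.0 - 2.0 * gamma))

# Hypothetical parameter values, chosen only to show the effect of a heavy tail.
x0, sigma, alpha = 3.0, 1.5, 0.01
for gamma in (0.232, 0.45, 0.55):
    thresholds = [gpd_threshold(x0, sigma, gamma, alpha, 10.0 ** -k) for k in range(2, 8)]
    print(gamma, [round(t, 1) for t in thresholds], gpd_variance(sigma, gamma))
```

Because the threshold grows like $(P_F/\alpha)^{-\gamma}$, small errors in the estimate of $\gamma$ are amplified at very small $P_F$ when $\gamma$ is near 0.5, which is consistent with the spread seen in the histograms for $P_F \leq 10^{-5}$.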
Figure 10.7: Histograms of threshold values. (A) $P_F = 10^{-2}$ (B) $P_F = 10^{-3}$ (C) $P_F = 10^{-4}$
(D) $P_F = 10^{-5}$ (E) $P_F = 10^{-6}$ (F) $P_F = 10^{-7}$


Chapter  11 

Performance  of  the  Locally  Optimum 
Detector  for  the  Multivariate 
Student-T  Distribution 


In radar problems involving weak signal applications, it is found that the large returns due to
clutter can lead to a small signal-to-disturbance ratio. The large returns from clutter result
when  the  density  function  of  the  clutter  exhibits  an  extended  tail  behavior.  Consequently,  the 
probability  density  function  of  the  disturbance  can  no  longer  be  modeled  as  Gaussian.  The 
significance  of  a  non-Gaussian  PDF  with  an  extended  tail  is  that  many  more  large  returns  result 
than  would  be  the  case  for  a  Gaussian  PDF  having  the  same  variance.  Hence,  there  is  a  need  to 
be  able  to  model  non-Gaussian  random  processes. 

The  multivariate  student-T  distribution  is  a  member  of  the  class  of  joint  PDFs  arising  from 
Spherically  Invariant  Random  Processes  (SIRP).  SIRPs  are  explained  in  detail  in  the  earlier 
chapters.  When  an  SIRP  is  sampled  at  N  instants  in  time,  the  resulting  vector  is  said  to  be 
spherically  invariant.  The  theory  of  SIRPs  offers  a  way  to  model  the  joint  density  function  on 
these  N  samples  where  the  correlation  between  the  individual  random  variables  in  the  vector  is 
accounted  for.  With  this  approach  locally  optimum  detector  structures  can  be  derived  for  non- 
Gaussian  disturbances  without  the  need  to  assume  that  the  random  variables  are  statistically 
independent.  In  this  chapter  we  analyze  the  performance  of  the  LOD  when  the  background 
disturbance  consisting  of  clutter  and  noise  can  be  modeled  as  having  a  multivariate  student-T 
distribution. 
11.1  The Multivariate Student-T Distribution

A convenient procedure for generating a multivariate student-T distribution is discussed in
this section. Let the random vector $\mathbf{X}$ have a multivariate Gaussian distribution with zero mean
and covariance matrix $M$. The zero mean assumption will not affect the generality of the results
that follow. The joint density function of the elements of $\mathbf{X}$ is given by

$$ f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{(2\pi)^N |M|^{1/2}} \exp\!\left(-\frac{\mathbf{x}^T M^{-1} \mathbf{x}}{2}\right) \tag{11.1} $$


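where the vector $\mathbf{X}$ has $2N$ elements from $N$ inphase and $N$ quadrature samples. Consider the
vector $\mathbf{W} = \mathbf{X}/\nu$, where $\nu$ is a nonnegative random variable statistically independent of $\mathbf{X}$. Let
$\mathbf{w}^T M^{-1} \mathbf{w}$ be denoted by the variable $p$. Then, the conditional density function of the vector $\mathbf{W}$
given $\nu$ can be written as

$$ f_{\mathbf{W}|\nu}(\mathbf{w} \mid \nu) = \frac{\nu^{2N}}{(2\pi)^N |M|^{1/2}} \exp\!\left(-\frac{\nu^2 p}{2}\right) \tag{11.2} $$

The unconditional density function of $\mathbf{W}$ is given by

$$ f_{\mathbf{W}}(\mathbf{w}) = \int_0^\infty f_{\mathbf{W}|\nu}(\mathbf{w} \mid \nu)\, f_\nu(\nu)\, d\nu \tag{11.3} $$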


where $f_\nu(\nu)$ is the probability density function of the random variable $\nu$. Because $\mathbf{X}$ and $\nu$ are
statistically independent, it follows that

$$ E(\mathbf{W}) = E(\mathbf{X}/\nu) = E(\mathbf{X})\,E(\nu^{-1}) = \mathbf{0} \tag{11.4} $$

$$ E(\mathbf{W}\mathbf{W}^T) = E(\mathbf{X}\mathbf{X}^T)\,E(\nu^{-2}) = E(\nu^{-2})\,M \tag{11.5} $$

It can be seen from equation (11.5) that the variance of the elements of the
vector $\mathbf{W}$ can be adjusted by an appropriate choice of $E(\nu^{-2})$.

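With respect to equation (11.3), let $f_\nu(\nu)$ be the generalized chi PDF given by

$$ f_\nu(\nu) = \frac{2\,\nu^{2\beta - 1} e^{-a\nu^2} a^\beta}{\Gamma(\beta)} \tag{11.6} $$

From equation (11.6), $E(\nu^{-2})$ can be calculated. Specifically,

$$ E(\nu^{-2}) = \int_0^\infty \frac{2\,\nu^{2\beta - 3} e^{-a\nu^2} a^\beta}{\Gamma(\beta)}\, d\nu \tag{11.7} $$

Letting $a\nu^2 = x$ in the above equation we get

$$ E(\nu^{-2}) = a \int_0^\infty \frac{x^{\beta - 2} e^{-x}}{\Gamma(\beta)}\, dx = \frac{\Gamma(\beta - 1)\, a}{\Gamma(\beta)} = \frac{a}{\beta - 1} \tag{11.8} $$

If we let $a = \beta - 1$, then the generalized chi PDF in equation (11.6) is such that $E(\nu^{-2}) = 1$
irrespective of the choice for the parameter $\beta$. Then the generalized chi PDF takes the form

$$ f_\nu(\nu) = \frac{2\,\nu^{2\beta - 1} e^{-(\beta - 1)\nu^2} (\beta - 1)^\beta}{\Gamma(\beta)} \tag{11.9} $$

In general, we can set the value of $E(\nu^{-2})$ to a desired constant $C$ by choosing $a = C(\beta - 1)$.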

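Integrating the conditional density function $f_{\mathbf{W}|\nu}(\mathbf{w} \mid \nu)$, as given by equation (11.2), over the
PDF of the nonnegative random variable $\nu$, we obtain the multivariate student-T distribution.
The details are given below. Choosing $a = \beta - 1$ in equation (11.6) we can write

$$ f_{\mathbf{W}}(\mathbf{w}) = \int_0^\infty \frac{\nu^{2N} e^{-\nu^2 p/2}}{(2\pi)^N |M|^{1/2}} \cdot \frac{2\,\nu^{2\beta - 1} e^{-(\beta - 1)\nu^2} (\beta - 1)^\beta}{\Gamma(\beta)}\, d\nu = \frac{(\beta - 1)^\beta}{(2\pi)^N |M|^{1/2}\, \Gamma(\beta)} \int_0^\infty 2\,\nu^{2N + 2\beta - 1} e^{-\nu^2(\beta - 1 + p/2)}\, d\nu \tag{11.10} $$

Letting $(\beta - 1 + p/2)\nu^2 = y$ we get

$$ f_{\mathbf{W}}(\mathbf{w}) = \frac{(\beta - 1)^\beta}{(2\pi)^N |M|^{1/2}\, \Gamma(\beta)} \int_0^\infty \frac{y^{N + \beta - 1} e^{-y}}{(\beta - 1 + p/2)^{N + \beta}}\, dy = \frac{(\beta - 1)^\beta\, \Gamma(N + \beta)}{(2\pi)^N |M|^{1/2}\, \Gamma(\beta)\, (\beta - 1 + p/2)^{N + \beta}} \tag{11.11} $$

The above expression is defined to be the $2N$-dimensional multivariate student-T distribution
with parameters $N$ and $\beta$. $N$ represents the number of complex samples and $\beta$ determines the
tail behavior of the multivariate density function.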

For simulation purposes, the density function in equation (11.9) can be simulated as follows.
The first step is to generate a standard Gamma variate from the density function
$f_Y(y) = y^{\beta - 1} e^{-y}/\Gamma(\beta)$.
The IMSL package was used to generate the standard Gamma variates. The next step is to divide
the generated random variable by the parameter $\beta - 1$. Let $X = Y/(\beta - 1)$. The density function
of $X$ is

$$ f_X(x) = \frac{(\beta - 1)^\beta\, x^{\beta - 1} e^{-(\beta - 1)x}}{\Gamma(\beta)} \tag{11.12} $$

The positive square root of $X$ results in the desired density function. Let $\nu = X^{1/2}$. Therefore
$X = \nu^2$. Introducing the Jacobian of the transformation, the density function of $\nu$ becomes

$$ f_\nu(\nu) = \frac{2\,\nu^{2\beta - 1} e^{-(\beta - 1)\nu^2} (\beta - 1)^\beta}{\Gamma(\beta)} \tag{11.13} $$

which is identical to that in equation (11.9).
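The generation procedure of this section can be sketched in Python. This is a minimal illustration that uses NumPy's generators in place of the IMSL routines named in the text; the function name, the example covariance matrix, and the value $\beta = 5$ are assumptions made for the demonstration.

```python
import numpy as np

def sample_multivariate_student_t(n_samples, M, beta, rng):
    """Draw W = X / nu, with X ~ N(0, M) and nu generalized chi with a = beta - 1.

    nu is obtained as in the text: Y ~ standard Gamma(beta), X' = Y / (beta - 1),
    nu = sqrt(X'). Requires beta > 1 so that E(nu^-2) = 1 is finite.
    """
    M = np.asarray(M, dtype=float)
    chol = np.linalg.cholesky(M)                       # M = chol @ chol.T
    x = rng.standard_normal((n_samples, M.shape[0])) @ chol.T
    y = rng.gamma(shape=beta, scale=1.0, size=n_samples)
    nu = np.sqrt(y / (beta - 1.0))
    return x / nu[:, None], nu

rng = np.random.default_rng(12345)
M = np.array([[1.0, 0.5],
              [0.5, 1.0]])                             # illustrative covariance
w, nu = sample_multivariate_student_t(200_000, M, beta=5.0, rng=rng)

print("sample E(nu^-2):", np.mean(nu ** -2.0))         # should be near 1
print("sample E(W W^T):\n", w.T @ w / w.shape[0])      # should be near M
```

With $a = \beta - 1$ the sample covariance of $\mathbf{W}$ should be close to $M$, in line with equation (11.5). Choosing a small $\beta$ gives a heavier tail, and the sample moments then converge more slowly.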

11.2  The  Locally  Optimum  Detector 

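The locally optimum detector for the multivariate student-T distribution can now be derived.
From equation (9.32) the locally optimum detector is given as

$$ \frac{\left.\dfrac{\partial f_{\mathbf{D}}(\mathbf{r} - \theta\mathbf{s})}{\partial \theta}\right|_{\theta = 0}}{f_{\mathbf{D}}(\mathbf{r})} \; \underset{H_0}{\overset{H_1}{\gtrless}} \; \eta \tag{11.14} $$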


Assuming the disturbance can be modeled by a multivariate student-T distribution, $f_{\mathbf{D}}(\mathbf{r})$ is
given by equation (11.11), where $p = \mathbf{r}^T M^{-1} \mathbf{r}$. Since equation (11.14) is a ratio test and
all constants can be placed in the threshold, which is determined by specifying a false alarm
probability, all multiplicative constants are ignored for convenience. Hence, we will be concerned
only with the terms containing the variable $p$. Excluding the constant term, the numerator in
the ratio test is given by

$$ \left.\frac{\partial f_{\mathbf{D}}(\mathbf{r} - \theta\mathbf{s})}{\partial \theta}\right|_{\theta = 0} = \left.\frac{\partial}{\partial \theta}\, \frac{1}{(\beta - 1 + p/2)^{N + \beta}}\right|_{\theta = 0} \tag{11.15} $$


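Applying the chain rule, the derivative with respect to $\theta$ can be expressed as the derivative with
respect to $p$ times the derivative of $p$ with respect to $\theta$. The derivative of $p$ with respect to $\theta$ at
$\theta = 0$ can be derived as

$$ \left.\frac{\partial p}{\partial \theta}\right|_{\theta = 0} = \left.\frac{\partial}{\partial \theta}\left((\mathbf{r} - \theta\mathbf{s})^T M^{-1} (\mathbf{r} - \theta\mathbf{s})\right)\right|_{\theta = 0} = -2\,\mathbf{s}^T M^{-1} \mathbf{r}. \tag{11.16} $$

Therefore, the numerator in the ratio test, excluding the constant, is given by

$$ \left.\frac{\partial f_{\mathbf{D}}(\mathbf{r} - \theta\mathbf{s})}{\partial \theta}\right|_{\theta = 0} = \frac{N + \beta}{(\beta - 1 + p/2)^{N + \beta + 1}} \times \mathbf{s}^T M^{-1} \mathbf{r} \tag{11.17} $$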
From the above equation, the sufficient statistic for the locally optimum detector for the
multivariate student-T distribution can be written as

$$ T_{LOD}(\mathbf{r}) = \frac{\mathbf{s}^T M^{-1} \mathbf{r}}{\beta - 1 + p/2} \tag{11.18} $$

The above result for the LOD statistic is very significant. The numerator in equation (11.18) is recognized as the Gaussian linear detector. This detector is a matched filter which maximizes the signal-to-disturbance ratio whether or not the disturbance is Gaussian. In weak signal applications the signal-to-disturbance ratio will still be low after matched filtering. The denominator of the LOD statistic is the nonlinear term in the statistic. The nonlinearity, plotted as a function of $p$ in Fig. 11.1, scales down large values of $p$ and enhances small values of $p$. This behavior is reasonable because large radar returns result in large $p$ while small returns yield small $p$. Because it is known a priori that we are dealing with the weak signal problem, large returns cannot be due to the signal. Consequently, the output of the matched filter is weighted by a small number. On the other hand, when the return is small, the matched filter output is weighted by a large number so that the contribution due to the signal, if present, can be detected.
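
The weighting described above can be illustrated with a short Python sketch. It assumes real-valued vectors and the form of the statistic in equation (11.18); the function names `lod_statistic` and `nonlinearity` are hypothetical and are not taken from the report.

```python
import numpy as np

def nonlinearity(p, beta):
    """Weight 1 / (beta - 1 + p/2) applied to the matched-filter output."""
    return 1.0 / (beta - 1.0 + 0.5 * p)

def lod_statistic(r, s, M_inv, beta):
    """LOD statistic in the form of eq. (11.18) for a real received vector r."""
    p = r @ M_inv @ r              # quadratic form p = r^T M^{-1} r
    matched = s @ M_inv @ r        # Gaussian (matched-filter) statistic s^T M^{-1} r
    return matched * nonlinearity(p, beta)

# Large p gives a small weight, small p gives a large weight (beta = 1.5 as in the text).
for p in (0.1, 1.0, 10.0, 100.0):
    print(f"p = {p:6.1f}   weight = {nonlinearity(p, 1.5):.4f}")
```

The loop only prints the weight for a few values of $p$; it is meant to show the monotonically decreasing shape of the nonlinearity, not to reproduce Fig. 11.1.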

11.3  Computer  Simulation  of  Performance 

The performance of the locally optimum detector in multivariate student-T distributed clutter is obtained through computer simulations for weak signal applications. For simulation purposes a multivariate student-T distributed disturbance vector $\mathbf{D}$ and a transmitted signal vector $\mathbf{S}$ have to be generated. The first step in generating the correlated multivariate student-T distributed random variables is to generate a $2N$-dimensional white Gaussian random vector. Subroutine DRNNOA from the IMSL package is used to generate a white Gaussian vector of the desired dimension. Each element of the white Gaussian vector is divided by the same random variable, generated from the density function in equation (11.9). This results in a white student-T distributed vector. The next step is to introduce correlation between the random variables. The covariance matrix of the clutter process is assumed known with unit elements along the diagonal. To get the covariance matrix $M$ of the disturbance we add a small number, determined by the clutter-to-noise ratio, to the diagonal elements of the clutter covariance matrix. This serves to limit the performance of the receiver even where the clutter power is negligible. In this simulation, the clutter-to-noise ratio is taken to be 80 dB. Given the covariance matrix, a Cholesky decomposition is carried out such that $M = KK^T$, where $K$ is a lower triangular matrix. Multiplying the white student-T distributed vector by the matrix $K$, we obtain a student-T distributed vector with the desired correlation between the random variables.

Figure 11.1: Nonlinearity for the student-T distribution.

The autocorrelation of the clutter process is taken to be a geometric function in this problem. Assuming radar returns from clutter cells to be highly correlated, as is the case with ground clutter, the sample-to-sample correlation is taken as 0.95. Specifically, the sample autocorrelation function is chosen as

$$ R_{cc}(n) = (0.95)^n, \qquad n = 0, 1, \ldots, N-1 \tag{11.19} $$

where $R_{cc}(n)$ is the discrete-time autocorrelation function of the clutter process. Using the above function, the elements of the covariance matrix of the disturbance can be filled appropriately. The elements of the signal vector are chosen such that the $n$th element is $s_n = e^{j2\pi f_0 (n-1)T}$, $n = 1, 2, \ldots, N$, where $f_0$ represents the Doppler frequency shift of the received signal and $T$ represents the time separation between sampling instants.
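
The generation steps described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the report's IMSL-based code: it uses NumPy in place of DRNNOA, works with real-valued vectors of length `n` rather than the $2N$-dimensional quadrature representation, and draws the divisor as the square root of a gamma variate with shape $\beta$ and rate $\beta - 1$ (which requires $\beta > 1$). All function names are hypothetical.

```python
import numpy as np

def disturbance_covariance(n, rho=0.95, cnr_db=80.0):
    """Geometric clutter covariance with a small noise term added to the diagonal."""
    idx = np.arange(n)
    clutter = rho ** np.abs(idx[:, None] - idx[None, :])   # R_cc(|i-j|) = rho^|i-j|
    noise_power = 10.0 ** (-cnr_db / 10.0)                  # unit clutter power / CNR
    return clutter + noise_power * np.eye(n)

def student_t_vectors(M, beta, trials, rng):
    """Correlated student-T vectors (one per row): K (z / v) with M = K K^T."""
    n = M.shape[0]
    z = rng.standard_normal((trials, n))                    # white Gaussian vectors
    # v^2 ~ Gamma(shape=beta, rate=beta-1); one v is shared by all elements of a vector
    v = np.sqrt(rng.gamma(shape=beta, scale=1.0 / (beta - 1.0), size=(trials, 1)))
    white_t = z / v                                         # white student-T vectors
    K = np.linalg.cholesky(M)                               # lower triangular factor
    return white_t @ K.T

def signal_vector(n, f0=0.0, T=1.0):
    """s_n = exp(j 2 pi f0 (n-1) T) for n = 1..N."""
    return np.exp(2j * np.pi * f0 * np.arange(n) * T)

rng = np.random.default_rng(0)
M = disturbance_covariance(16)
d = student_t_vectors(M, beta=1.5, trials=5, rng=rng)
print(d.shape, signal_vector(16).shape)
```

With $f_0 = 0$ the signal vector reduces to a vector of ones, which is the case used in the simulations reported below.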

The detector in equation (11.18) is now simulated. A value of $\beta = 1.5$ for the multivariate student-T distribution is chosen because this value results in a relatively long tail for the corresponding marginal PDF of one element of the vector. By evaluating thresholds for specified false alarm probabilities, the student-T distribution was seen to have heavier tails than the Gaussian distribution for false alarm probabilities less than $10^{-4}$ but lighter tails than the Gaussian otherwise.

The thresholds corresponding to false alarm probabilities $10^{-k}$, $k = 1, 2, 3, 4$, are obtained through the method of extreme value theory explained in Chapter 10. Once the threshold is set, the detection probabilities are obtained by simulating the LOD for received vectors consisting of the sum of the signal and disturbance vectors for various signal-to-disturbance ratios. The value of $f_0$ is chosen to be zero in this simulation. The number of trials in the Monte Carlo simulation for each case is equal to 10,000. The performance of the LOD is compared to that of the Gaussian detector for the same multivariate student-T distributed clutter. The test statistic for the Gaussian detector is the same as the numerator of the LOD, which is $\mathbf{s}^T M^{-1}\mathbf{r}$. The results are shown in Tables 11.1-11.14.
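
A Monte Carlo comparison of this kind can be sketched as below. Several choices here are assumptions made only for illustration: the thresholds are set from an empirical quantile of the statistic under $H_0$ (a simpler substitute for the extreme value method of Chapter 10), the vectors are real-valued, and the signal amplitude is scaled as $\sqrt{10^{\mathrm{SCR}/10}}$ because the report's exact SCR normalization is not restated in this section. The function name `simulate_pd` is hypothetical.

```python
import numpy as np

def simulate_pd(n=16, beta=1.5, pf=1e-2, scr_db=-10.0, trials=10_000, seed=0):
    """Estimate (P_D of LOD, P_D of Gaussian receiver) by Monte Carlo simulation."""
    rng = np.random.default_rng(seed)
    idx = np.arange(n)
    M = 0.95 ** np.abs(idx[:, None] - idx[None, :]) + 1e-8 * np.eye(n)
    K = np.linalg.cholesky(M)
    M_inv = np.linalg.inv(M)
    # f0 = 0, so the signal is a constant vector; the amplitude scaling is an assumption
    s = np.sqrt(10.0 ** (scr_db / 10.0)) * np.ones(n)

    def disturbance(count):
        z = rng.standard_normal((count, n))
        v = np.sqrt(rng.gamma(beta, 1.0 / (beta - 1.0), size=(count, 1)))
        return (z / v) @ K.T

    def statistics(r):
        matched = r @ M_inv @ s                      # Gaussian receiver: s^T M^{-1} r
        p = np.sum((r @ M_inv) * r, axis=1)          # p = r^T M^{-1} r for each trial
        return matched / (beta - 1.0 + 0.5 * p), matched

    lod0, gr0 = statistics(disturbance(20 * trials))  # H0: disturbance only
    thr_lod = np.quantile(lod0, 1.0 - pf)             # empirical thresholds for P_F = pf
    thr_gr = np.quantile(gr0, 1.0 - pf)
    lod1, gr1 = statistics(disturbance(trials) + s)   # H1: signal plus disturbance
    return float(np.mean(lod1 > thr_lod)), float(np.mean(gr1 > thr_gr))

if __name__ == "__main__":
    print(simulate_pd())
```

Because of these simplifications the numbers produced by the sketch should not be expected to match the tabulated values; it only shows the structure of the experiment. Empirical quantiles also become unreliable for very small $P_F$ unless the number of $H_0$ trials is increased substantially.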

From the tables it can be seen that, when the false alarm probability is $10^{-1}$, the LOD and the Gaussian receiver have comparable performance for the various signal-to-clutter ratios considered. For lower false alarm probabilities, the LOD always outperforms the Gaussian receiver except for the zero dB entries in Tables 11.1, 11.4, 11.7, 11.11 and 11.12. The difference is especially significant for false alarm probabilities equal to $10^{-3}$ and $10^{-4}$.

From our computer simulations we expect that the performance improvement of the LOD over the linear Gaussian receiver depends on the shape of the tail of the disturbance PDF. The heavier the tail of the disturbance PDF, the greater the expected improvement in performance. The student-T distribution, while being heavier tailed than the Gaussian, is not as heavy tailed as the K-distribution and the Weibull distribution. In fact, the student-T distribution may not be a likely candidate for modeling the radar disturbance. The student-T distribution was chosen as the first distribution to be studied only because of the mathematical simplicity and well-behaved nature of its multivariate PDF. Nevertheless, the analysis done with the student-T distribution confirms that the LOD outperforms the Gaussian receiver for weak signal applications.

11.4  Conclusions 

It can be observed from the tables that the Gaussian receiver performance degrades abruptly for false alarm probabilities less than or equal to $10^{-2}$ whereas the LOD shows a gentler degradation in performance. Both receivers show an improvement in performance as the number of samples is increased. However, the LOD shows a dramatic improvement in performance when the sample size is greater than 64. From Table 11.14, it can be seen that for SCR = 0 dB and $P_F = 10^{-4}$, the detection probability for the LOD is 0.3720 while that for the Gaussian receiver is 0.0003. This represents an improvement factor of about three orders of magnitude for the LOD. Also, from Tables 11.3, 11.6, 11.9 and 11.13, we observe that when $P_F$ is set to $10^{-3}$ the LOD shows a performance improvement of nearly two orders of magnitude at SCR = 0 dB compared to the Gaussian receiver. For larger sample sizes, e.g., 64 and 128, the detection probability of the LOD is in the tenths for SCR = -10 dB and $P_F = 10^{-2}$, while for the Gaussian receiver it is in the hundredths. Overall, when $P_F$ is less than or equal to $10^{-2}$, the Gaussian receiver requires a signal-to-clutter ratio 10-20 dB larger than that required by the LOD for the same values of $P_F$ and $P_D$.
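
As a quick check of the quoted improvement factors, the ratios can be formed directly from the 0 dB entries of Tables 11.14 and 11.9. The superscripted notation $P_D^{\mathrm{LOD}}$ and $P_D^{\mathrm{GR}}$ is introduced here only for this illustration.

```latex
\left.\frac{P_D^{\mathrm{LOD}}}{P_D^{\mathrm{GR}}}\right|_{N=128,\;P_F=10^{-4}}
  = \frac{0.3720}{0.0003} = 1240 \approx 10^{3.1},
\qquad
\left.\frac{P_D^{\mathrm{LOD}}}{P_D^{\mathrm{GR}}}\right|_{N=64,\;P_F=10^{-3}}
  = \frac{0.3643}{0.0048} \approx 75.9 \approx 10^{1.9}.
```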

The LOD does not work well if the signal-to-clutter ratio is too large. The performance degrades rapidly for signal-to-clutter ratios exceeding zero dB. The LOD is designed for detecting targets when it is known that the signal is weak. The aim of using an LOD is to obtain detection in range-Doppler-azimuth cells where conventional space-time processing does not give acceptable performance. These cells are currently ignored because it is felt that they are hopeless for target detection purposes. The nonlinearity present in the LOD plays the role of suppressing large returns. However, if the SCR is high, the large returns are more likely to be due to the signal than to the clutter. Hence, the detection performance will drop off compared to the Gaussian receiver. In general, when the SCR is relatively high ($\geq 0$ dB) the likelihood ratio test is the optimal test for target detection under a fixed false alarm constraint. When the signal-to-clutter ratio becomes very close to zero, the LOD receiver will hardly show any detections even though it would still outperform the Gaussian receiver. This is because the PDFs under $H_0$ and $H_1$ are so close to each other that it is impossible to separate them without increasing the sample size by orders of magnitude.

The concepts of spherically invariant random processes and locally optimum detectors are particularly relevant in the context of modern radar applications. When the radar scans a volume searching for targets there might be certain regions in the volume where the clutter returns are so strong that signal returns get blanked out. It is in these regions that we can obtain detections with LODs. There is a need to monitor the environment so that we are able to separate the clutter regions from volumes that are just limited by background noise. When detections are limited by background noise alone, LODs are inapplicable. In this research effort, work is beginning in the area of using artificial intelligence (AI) for monitoring the volume. Using AI, clutter patches can be identified and the underlying multivariate PDF of the clutter returns can be approximated using the library of SIRPs that have been developed. From the library of LODs, the LOD corresponding to the approximated SIRP can be used in clutter regions to obtain detections if the target is present, where earlier it would not have been possible.

| SCR | LOD $P_D$ | GR $P_D$ |
|---|---|---|
| 0 dB | 0.7047 | 0.8600 |
| -10 dB | 0.3220 | 0.2800 |
| -20 dB | 0.1611 | 0.1460 |
| -30 dB | 0.1176 | 0.1190 |

Table 11.1: Sample Size = 16, $P_F = 10^{-1}$. SCR: Signal-to-Clutter Ratio, LOD: Locally Optimum Detector, GR: Gaussian Receiver.


| SCR | LOD $P_D$ | GR $P_D$ |
|---|---|---|
| 0 dB | 0.3761 | 0.1050 |
| -10 dB | 0.0838 | 0.0190 |
| -20 dB | 0.0246 | 0.0120 |
| -30 dB | 0.0141 | 0.0100 |

Table 11.2: Sample Size = 16, $P_F = 10^{-2}$. SCR: Signal-to-Clutter Ratio, LOD: Locally Optimum Detector, GR: Gaussian Receiver.


| SCR | LOD $P_D$ | GR $P_D$ |
|---|---|---|
| 0 dB | 0.1604 | 0.0030 |
| -10 dB | 0.0198 | 0.0014 |
| -20 dB | 0.0027 | 0.0011 |
| -30 dB | 0.0012 | 0.0001 |

Table 11.3: Sample Size = 16, $P_F = 10^{-3}$. SCR: Signal-to-Clutter Ratio, LOD: Locally Optimum Detector, GR: Gaussian Receiver.


| SCR | LOD $P_D$ | GR $P_D$ |
|---|---|---|
| 0 dB | 0.7607 | 0.9090 |
| -10 dB | 0.3608 | 0.3200 |
| -20 dB | 0.1704 | 0.1540 |
| -30 dB | 0.1202 | 0.1190 |

Table 11.4: Sample Size = 32, $P_F = 10^{-1}$. SCR: Signal-to-Clutter Ratio, LOD: Locally Optimum Detector, GR: Gaussian Receiver.

| SCR | LOD $P_D$ | GR $P_D$ |
|---|---|---|
| 0 dB | 0.4573 | 0.1750 |
| -10 dB | 0.1052 | 0.0220 |
| -20 dB | 0.0255 | 0.0130 |
| -30 dB | 0.0145 | 0.0120 |

Table 11.5: Sample Size = 32, $P_F = 10^{-2}$. SCR: Signal-to-Clutter Ratio, LOD: Locally Optimum Detector, GR: Gaussian Receiver.


| SCR | LOD $P_D$ | GR $P_D$ |
|---|---|---|
| 0 dB | 0.2621 | 0.0035 |
| -10 dB | 0.0289 | 0.0015 |
| -20 dB | 0.0042 | 0.0012 |
| -30 dB | 0.0013 | 0.0001 |

Table 11.6: Sample Size = 32, $P_F = 10^{-3}$. SCR: Signal-to-Clutter Ratio, LOD: Locally Optimum Detector, GR: Gaussian Receiver.


| SCR | LOD $P_D$ | GR $P_D$ |
|---|---|---|
| 0 dB | 0.8117 | 0.9510 |
| -10 dB | 0.4302 | 0.3790 |
| -20 dB | 0.1278 | 0.1590 |
| -30 dB | 0.1252 | 0.1195 |

Table 11.7: Sample Size = 64, $P_F = 10^{-1}$. SCR: Signal-to-Clutter Ratio, LOD: Locally Optimum Detector, GR: Gaussian Receiver.


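| SCR | LOD $P_D$ | GR $P_D$ |
|---|---|---|
| 0 dB | 0.5484 | (illegible) |
| -10 dB | 0.1446 | (illegible) |
| -20 dB | 0.0301 | (illegible) |
| -30 dB | 0.0152 | 0.0010 |

Table 11.8: Sample Size = 64, $P_F = 10^{-2}$. SCR: Signal-to-Clutter Ratio, LOD: Locally Optimum Detector, GR: Gaussian Receiver.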

| SCR | LOD $P_D$ | GR $P_D$ |
|---|---|---|
| 0 dB | 0.3643 | 0.0048 |
| -10 dB | 0.0492 | 0.0016 |
| -20 dB | 0.0057 | 0.0012 |
| -30 dB | 0.0019 | 0.0001 |

Table 11.9: Sample Size = 64, $P_F = 10^{-3}$. SCR: Signal-to-Clutter Ratio, LOD: Locally Optimum Detector, GR: Gaussian Receiver.


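| SCR | LOD $P_D$ | GR $P_D$ |
|---|---|---|
| 0 dB | (illegible) | 0.0002 |
| -10 dB | (illegible) | 0.0001 |
| -20 dB | (illegible) | 0.0001 |
| -30 dB | (illegible) | 0.0000 |

Table 11.10: Sample Size = 64, $P_F = 10^{-4}$. SCR: Signal-to-Clutter Ratio, LOD: Locally Optimum Detector, GR: Gaussian Receiver.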


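| SCR | LOD $P_D$ | GR $P_D$ |
|---|---|---|
| 0 dB | 0.8517 | (illegible) |
| -10 dB | 0.4987 | (illegible) |
| -20 dB | (illegible) | (illegible) |
| -30 dB | 0.1314 | 0.1186 |
| -40 dB | (illegible) | (illegible) |

Table 11.11: Sample Size = 128, $P_F = 10^{-1}$. SCR: Signal-to-Clutter Ratio, LOD: Locally Optimum Detector, GR: Gaussian Receiver.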


| SCR | LOD $P_D$ | GR $P_D$ |
|---|---|---|
| 0 dB | 0.6511 | 0.7050 |
| -10 dB | 0.2190 | 0.0320 |
| -20 dB | 0.0445 | 0.0150 |
| -30 dB | 0.0198 | 0.0116 |
| -40 dB | 0.0147 | 0.0010 |

Table 11.12: Sample Size = 128, $P_F = 10^{-2}$. SCR: Signal-to-Clutter Ratio, LOD: Locally Optimum Detector, GR: Gaussian Receiver.

| SCR | LOD $P_D$ | GR $P_D$ |
|---|---|---|
| 0 dB | 0.4777 | 0.0090 |
| -10 dB | 0.0869 | 0.0020 |
| -20 dB | 0.0098 | 0.0013 |
| -30 dB | 0.0037 | 0.0011 |
| -40 dB | 0.0021 | 0.0001 |

Table 11.13: Sample Size = 128, $P_F = 10^{-3}$. SCR: Signal-to-Clutter Ratio, LOD: Locally Optimum Detector, GR: Gaussian Receiver.


| SCR | LOD $P_D$ | GR $P_D$ |
|---|---|---|
| 0 dB | 0.3720 | 0.0003 |
| -10 dB | 0.0430 | 0.0002 |
| -20 dB | 0.0039 | 0.0001 |
| -30 dB | 0.0007 | 0.0001 |
| -40 dB | 0.0003 | 0.0000 |

Table 11.14: Sample Size = 128, $P_F = 10^{-4}$. SCR: Signal-to-Clutter Ratio, LOD: Locally Optimum Detector, GR: Gaussian Receiver.

Appendix A

Properties of SIRVs


In this appendix we present some original proofs for properties of SIRPs stated in the literature.

A.1 Statistical Independence

An SSRV $\mathbf{X} = [X_1, X_2, \ldots, X_N]^T$ results in statistical independence of the $X_i$, $i = 1, 2, \ldots, N$, if and only if the SSRV is Gaussian.

Proof: Recall that the PDF of $\mathbf{X}$ can be expressed as

$$ f_{\mathbf{X}}(\mathbf{x}) = k\,h_N\!\left[(x_1^2 + x_2^2 + \cdots + x_N^2)^{1/2}\right] = (2\pi)^{-N/2}\, h_N\!\left(\sqrt{\mathbf{x}^T\mathbf{x}}\right). \tag{A.1} $$

If the components of $\mathbf{X}$ are statistically independent, then the PDF given by eq (A.1) must factor into the product of the marginal PDFs of the components of $\mathbf{X}$. It then follows that

$$ h_N\!\left[(x_1^2 + x_2^2 + \cdots + x_N^2)^{1/2}\right] = \prod_{i=1}^{N} g(x_i). \tag{A.2} $$

Letting $r = (x_1^2 + x_2^2 + \cdots + x_N^2)^{1/2}$ and differentiating both sides of eq (A.2) with respect to $x_i$ results in

$$ \frac{x_i}{r}\, h_N'(r) = g'(x_i) \prod_{\substack{j=1 \\ j \neq i}}^{N} g(x_j). \tag{A.3} $$

Dividing both sides of eq (A.3) by $x_i h_N(r)$ results in

$$ \frac{h_N'(r)}{r\,h_N(r)} = \frac{g'(x_i)}{x_i\,g(x_i)}. \tag{A.4} $$

Equality holds in eq (A.4) if and only if the left and right sides of eq (A.4) are equal to the same constant. Denoting this constant by $-\lambda$, we have

$$ \frac{h_N'(r)}{r\,h_N(r)} = -\lambda. \tag{A.5} $$

Multiplying eq (A.5) through by $r$ and integrating both sides with respect to $r$ gives

$$ h_N(r) = a\exp\!\left(-\frac{\lambda r^2}{2}\right) \tag{A.6} $$

where $a$ is the constant of integration. Hence,

$$ h_N\!\left[(x_1^2 + x_2^2 + \cdots + x_N^2)^{1/2}\right] = a\exp\!\left[-\frac{\lambda}{2}\left(x_1^2 + x_2^2 + \cdots + x_N^2\right)\right]. \tag{A.7} $$

Substitution of eq (A.7) in eq (A.1) clearly results in the Gaussian PDF. The constraint of unity volume under the PDF results in $a = \lambda^{N/2}$.

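In order to prove the sufficient part of the property, we start with the marginal PDFs of the components of $\mathbf{X}$ given by

$$ f_{X_i}(x_i) = \left(\frac{2\pi}{\lambda}\right)^{-1/2} \exp\!\left(-\frac{\lambda x_i^2}{2}\right). \tag{A.8} $$

Under the assumption of statistical independence, we obtain the PDF of $\mathbf{X}$ by taking the product of the marginal PDFs of its components as

$$ f_{\mathbf{X}}(\mathbf{x}) = \left(\frac{2\pi}{\lambda}\right)^{-N/2} \exp\!\left(-\frac{\lambda}{2}\sum_{i=1}^{N} x_i^2\right). \tag{A.9} $$

Clearly the PDF given by eq (A.9) is of the form of eq (A.1). Hence, the sufficient part of the property follows.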

An alternate proof of this property can be obtained by using the representation theorem. The representation theorem allows us to express the SSRV $\mathbf{X}$ as a product of a Gaussian random vector $\mathbf{Z}$ having zero mean and identity covariance matrix and a non-negative random variable $S$. More precisely, we can write

$$ \mathbf{X} = \mathbf{Z}S. \tag{A.10} $$

The components of $\mathbf{X}$ can be statistically independent if and only if $S$ is a constant. When $S$ is a constant, $\mathbf{X}$ is a Gaussian SSRV. As is often the case, the representation theorem provides a simplified approach for determining properties of SIRVs.
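
The role of $S$ in this property can be illustrated numerically. The sketch below is only an illustration under assumed choices: a two-dimensional $\mathbf{Z}$, and a two-valued random $S$ picked for convenience. It estimates the correlation between $X_1^2$ and $X_2^2$, which is zero for independent components and nonzero when $S$ is random. The helper name `squared_component_correlation` is hypothetical.

```python
import numpy as np

def squared_component_correlation(s, rng, trials=200_000):
    """Sample correlation between X_1^2 and X_2^2 for X = Z * S, Z ~ N(0, I_2)."""
    z = rng.standard_normal((trials, 2))
    x = z * s(trials)[:, None]                 # same S multiplies both components
    return float(np.corrcoef(x[:, 0] ** 2, x[:, 1] ** 2)[0, 1])

rng = np.random.default_rng(0)
constant_s = lambda n: np.ones(n)                      # S constant: Gaussian SSRV
random_s = lambda n: rng.choice([0.5, 2.0], size=n)    # S random: non-Gaussian SSRV

print("constant S:", squared_component_correlation(constant_s, rng))
print("random S:  ", squared_component_correlation(random_s, rng))
```

A nonzero correlation between the squared components shows dependence even though the components themselves remain uncorrelated.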

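A.2 Spherically Symmetric Characteristic Function

In this section, we prove that the characteristic function of an SSRV is spherically symmetric.
Proof: We consider the SSRV $\mathbf{X} = [X_1, X_2, \ldots, X_N]^T$. From the representation theorem, we can write $\mathbf{X} = \mathbf{Z}S$ where $\mathbf{Z}$ is a Gaussian random vector having zero mean and identity covariance matrix and $S$ is a non-negative random variable with PDF $f_S(s)$. The characteristic function of $\mathbf{X}$, given by

$$ \Phi_{\mathbf{X}}(\boldsymbol{\omega}) = E[\exp(j\boldsymbol{\omega}^T\mathbf{X})] \tag{A.11} $$

where $\boldsymbol{\omega} = [\omega_1, \omega_2, \ldots, \omega_N]^T$, can be expressed as

$$ \Phi_{\mathbf{X}}(\boldsymbol{\omega}) = E_S\!\left[\Phi_{\mathbf{X}|S=s}(\boldsymbol{\omega})\right] \tag{A.12} $$

where $\Phi_{\mathbf{X}|S=s}(\boldsymbol{\omega}) = E[\exp(j\boldsymbol{\omega}^T\mathbf{Z}s)]$. However,

$$ E[\exp(j\boldsymbol{\omega}^T\mathbf{Z}s)] = \exp\!\left(-\frac{s^2}{2}\sum_{i=1}^{N}\omega_i^2\right). \tag{A.13} $$

Using eq (A.13) in eq (A.12) results in

$$ \Phi_{\mathbf{X}}(\boldsymbol{\omega}) = \int_0^{\infty} \exp\!\left(-\frac{s^2}{2}\sum_{i=1}^{N}\omega_i^2\right) f_S(s)\,ds. \tag{A.14} $$

The characteristic function given by eq (A.14) can be expressed as a function of $\sqrt{\boldsymbol{\omega}^T\boldsymbol{\omega}}$. Hence it is spherically symmetric.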

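A.3 Relationship Between Higher Order and Lower Order SIRV PDFs

In this section we examine the relationship between the higher order and lower order SIRV PDFs. More precisely we consider an SIRV $\mathbf{Y} = [Y_1, Y_2, \ldots, Y_N]^T$ having mean vector $\boldsymbol{\mu}$, covariance matrix $\Sigma$ and characteristic PDF $f_S(s)$. The PDF of $\mathbf{Y}$ is given by

$$ f_{\mathbf{Y}}(\mathbf{y}) = (2\pi)^{-N/2}\,|\Sigma|^{-1/2}\, h_N(p) \tag{A.15} $$

where $p = (\mathbf{y} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{y} - \boldsymbol{\mu})$ and

$$ h_N(p) = \int_0^{\infty} s^{-N} \exp\!\left(-\frac{p}{2s^2}\right) f_S(s)\,ds. \tag{A.16} $$

The vector $\mathbf{Y}$ can be partitioned as $\mathbf{Y} = [\mathbf{Y}_1^T\ \mathbf{Y}_2^T]^T$ where $\mathbf{Y}_1 = [Y_1, Y_2, \ldots, Y_m]^T$ and $\mathbf{Y}_2 = [Y_{m+1}, Y_{m+2}, \ldots, Y_N]^T$. Let $\boldsymbol{\mu}_1$ and $\boldsymbol{\mu}_2$ denote the mean vectors of $\mathbf{Y}_1$ and $\mathbf{Y}_2$ respectively, and $\Sigma_1$ and $\Sigma_2$ denote the corresponding covariance matrices. We need to obtain the PDF of $\mathbf{Y}_1$ from the PDF of $\mathbf{Y}$ by integrating out over the $N - m$ random variables (i.e., the components of $\mathbf{Y}_2$). Let $p_1 = (\mathbf{y}_1 - \boldsymbol{\mu}_1)^T \Sigma_1^{-1} (\mathbf{y}_1 - \boldsymbol{\mu}_1)$ and $p_2 = (\mathbf{y}_2 - \boldsymbol{\mu}_2)^T \Sigma_2^{-1} (\mathbf{y}_2 - \boldsymbol{\mu}_2)$. The PDF of $\mathbf{Y}_1$ is given by

$$ f_{\mathbf{Y}_1}(\mathbf{y}_1) = (2\pi)^{-N/2}\,|\Sigma|^{-1/2} \int_{-\infty}^{\infty}\!\int_0^{\infty} s^{-N} \exp\!\left(-\frac{p}{2s^2}\right) f_S(s)\,ds\,d\mathbf{y}_2. \tag{A.17} $$

From [38] (p. 17, eq. 8; p. 18, eq. 11) we have

$$ \int_{-\infty}^{\infty} (2\pi)^{-N/2}\,|\Sigma|^{-1/2} \exp\!\left(-\frac{p}{2s^2}\right) d\mathbf{y}_2 = (2\pi)^{-m/2}\,|\Sigma_1|^{-1/2}\, s^{N-m} \exp\!\left(-\frac{p_1}{2s^2}\right). \tag{A.18} $$

Using eq (A.18) in eq (A.17) gives

$$ f_{\mathbf{Y}_1}(\mathbf{y}_1) = (2\pi)^{-m/2}\,|\Sigma_1|^{-1/2} \int_0^{\infty} s^{-m} \exp\!\left(-\frac{p_1}{2s^2}\right) f_S(s)\,ds. \tag{A.19} $$

The PDF of $\mathbf{Y}_1$ can be expressed as

$$ f_{\mathbf{Y}_1}(\mathbf{y}_1) = (2\pi)^{-m/2}\,|\Sigma_1|^{-1/2}\, h_m(p_1) \tag{A.20} $$

where

$$ h_m(p_1) = \int_0^{\infty} s^{-m} \exp\!\left(-\frac{p_1}{2s^2}\right) f_S(s)\,ds. \tag{A.21} $$

Clearly, $h_m(p_1)$ given by eq (A.21) can be obtained from eq (A.16) by simply replacing $N$ by $m$ and $p$ by $p_1$. To determine the PDF of $\mathbf{Y}_1$, all that is needed is the specification of its mean vector and covariance matrix. As a special case, when $m = 1$, eq (A.19) gives us the first order SIRV PDF. Therefore, to obtain the first order SIRV PDF of the $i$th component of $\mathbf{Y}$ starting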
from the $N$th order SIRV PDF, we simply use eq (A.19) with $m = 1$, $\Sigma_1 = \sigma_i^2$, and $p_1 = (y_i - \mu_i)^2/\sigma_i^2$, where $\mu_i$ and $\sigma_i^2$ are the mean and variance of $Y_i$.

Appendix B

Computer Generation of SIRVs Using
the Rejection Method


B.1 Rejection Method


We present a proof of the rejection procedure [42] used for generating the norm $R$ of the white SIRV $\mathbf{X}$ in Chapter 4. In many instances, the PDF of a random variable is known explicitly, but its cumulative distribution function is either unknown or has a complicated functional form. Consequently, the cumulative distribution function cannot be inverted easily, and the use of the inverse distribution function for generating the random variable does not offer a practical solution. Hence, it is necessary to use a different scheme for generating the random variable. We consider the problem of generating a sequence of random numbers with PDF $f_R(r)$ of a random variable $R$, in terms of a random number sequence with PDF $f_{U_1}(u_1)$ of a random variable $U_1$. The underlying assumption is that the random number sequence from the PDF of $U_1$ can be readily generated.

The rejection method used in Chapter 4 is based on the relative frequency interpretation of the conditional PDF

$$ f_{U_1}(u_1|M)\,du_1 = \frac{P\{u_1 < U_1 \leq u_1 + du_1,\ M\}}{P(M)} \tag{B.1} $$

of a random variable $U_1$ given the event $M$. $M$ is expressed in terms of the random variable $U_1$ and another random variable $U_2$ and is chosen so that the resulting conditional PDF $f_{U_1}(u_1|M)$ equals $f_R(r)$. The desired sequence is generated by setting $R = U_1$ given that the event $M$ has occurred and rejecting $U_1$ otherwise. The problem has a solution only if the domains of $r$ and $u_1$ are such that $f_R(r) = 0$ in every interval for which $f_{U_1}(u_1) = 0$. Therefore, we can assume that the ratio $f_{U_1}(u_1)/f_R(u_1)$ is bounded from below by some positive constant $a$:

$$ \frac{f_{U_1}(u_1)}{f_R(u_1)} \geq a > 0 \quad \text{for every } u_1. \tag{B.2} $$


B.2 Rejection Theorem

It is desired to generate a random variable $R$ with PDF $f_R(r)$. Let $U_1$ be any random variable with PDF $f_{U_1}(u_1)$ such that $f_{U_1}(u_1) = 0$ whenever $f_R(r) = 0$. Let $U_2$ be a uniformly distributed random variable on the interval $(0, 1)$. If the random variables $U_1$ and $U_2$ are statistically independent and

$$ M = \{U_2 \leq g(U_1)\} \tag{B.3} $$

where

$$ g(u_1) = a\,\frac{f_R(u_1)}{f_{U_1}(u_1)} \leq 1 \tag{B.4} $$

then

$$ f_{U_1}(u_1|M) = f_R(u_1). \tag{B.5} $$

Proof: The joint PDF of the random variables $U_1$ and $U_2$ can be written as $f_{U_1,U_2}(u_1, u_2) = f_{U_1}(u_1)\,f_{U_2}(u_2)$, since $U_1$ and $U_2$ are statistically independent. Hence, we have

$$ P(M) = \int_{-\infty}^{\infty}\!\int_0^{g(u_1)} f_{U_1}(u_1)\,f_{U_2}(u_2)\,du_2\,du_1. \tag{B.6} $$

However, since $U_2$ is uniformly distributed in the interval $(0, 1)$ and $g(u_1) \leq 1$,

$$ \int_0^{g(u_1)} f_{U_2}(u_2)\,du_2 = g(u_1). \tag{B.7} $$

Using eq (B.7) in eq (B.6) gives

$$ P(M) = \int_{-\infty}^{\infty} g(u_1)\,f_{U_1}(u_1)\,du_1. \tag{B.8} $$

However, $g(u_1) = a\,f_R(u_1)/f_{U_1}(u_1)$. Therefore, we have

$$ P(M) = a\int_{-\infty}^{\infty} f_R(u_1)\,du_1 = a. \tag{B.9} $$

We can express the numerator of eq (B.1) as

$$ P\{u_1 < U_1 \leq u_1 + du_1,\ M\} = \int_0^{g(u_1)} f_{U_1}(u_1)\,f_{U_2}(u_2)\,du_1\,du_2 = g(u_1)\,f_{U_1}(u_1)\,du_1 = a\,f_R(u_1)\,du_1. \tag{B.10} $$

Using eqs (B.9) and (B.10) in eq (B.1) results in eq (B.5).

Thus, we have the following algorithm for generating the sequence of random numbers from the PDF of $R$.

1. Generate $U_1$ and $U_2$.

2. If $U_2 \leq g(U_1)$, then $U_1 = R$.

3. Otherwise reject $U_1$.

With reference to the generation of the norm $R$ in Chapter 4, $U_1$ and $U_2$ were uniformly distributed random variables. Let $c$ denote the maximum value of the PDF of $R$ and $b$ denote a finite range for the PDF of $R$ such that the area under the curve of the PDF is close to unity. $U_1$ is assumed to be uniformly distributed in the interval $(0, b)$. Clearly, $f_{U_1}(u_1)/f_R(u_1) \geq 1/(bc)$. Hence, $f_R(u_1)/(bc\,f_{U_1}(u_1)) \leq 1$. Therefore, $a = 1/(bc)$. Step 2 above becomes: If $U_2 \leq f_R(u_1)/c$, then $U_1 = R$. This can be rewritten as: If $cU_2 \leq f_R(u_1)$, then $U_1 = R$. For ease of implementation, this latter form is implemented by drawing $cU_2$ directly as a random variable that is uniformly distributed over the interval $(0, c)$. This is the procedure followed in Chapter 4.

The method used in Chapter 4 becomes inefficient if $U_1$ is rejected frequently in step 3, resulting in the necessity to generate the two uniformly distributed random variables of step 1 an inordinate number of times. This problem can be overcome by using for $U_1$ a PDF which bounds the PDF of $R$ and satisfies the conditions stated in section B.1 and in the rejection theorem. Then a random variable from this PDF is used in step 1 instead of the uniform random variable $U_1$.

A second drawback of using a uniformly distributed random variable $U_1$ is that it may not be possible to efficiently generate SIRVs of length greater than 8. This is due to the fact that the PDF of $R$ depends on $N$. Consequently, the uniform distribution for $U_1$ may not satisfactorily bound the PDF of the norm $R$ for all $N$. This drawback can be overcome by choosing a different PDF for $U_1$ for each choice of $N$, such that the conditions stated in section B.1 and in the rejection theorem are satisfied. This method would require the use of an exhaustive table which tabulates the appropriate PDF of $U_1$ for each desired value of $N$.

Finally, it is pointed out that by using a composite function for the PDF of $U_1$, it is possible to improve the simulation procedure in terms of being able to generate random numbers from the body and the tail of the PDF of $R$. These issues are suitable topics for future investigation as an extension of this work.
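
The three-step rejection procedure with uniform $U_1$ on $(0, b)$ and a uniform variate on $(0, c)$ can be sketched in Python as follows. The target density used here, a unit Rayleigh PDF truncated to $(0, 6)$, is an assumed stand-in chosen only because its mean is known; it is not the norm PDF of Chapter 4. The function names are hypothetical.

```python
import numpy as np

def rejection_sample(pdf, b, c, count, rng):
    """Draw `count` samples from `pdf` on (0, b); c must bound the maximum of pdf."""
    kept, total = [], 0
    while total < count:
        u1 = rng.uniform(0.0, b, size=count)   # step 1: candidates, uniform on (0, b)
        u2 = rng.uniform(0.0, c, size=count)   # step 1: uniform on (0, c)
        accepted = u1[u2 <= pdf(u1)]           # step 2: keep U1 when c*U2 <= f_R(U1)
        kept.append(accepted)                  # step 3: the remaining U1 are rejected
        total += accepted.size
    return np.concatenate(kept)[:count]

def rayleigh_pdf(r):
    """Unit Rayleigh density, used here only as an example target."""
    return r * np.exp(-0.5 * r ** 2)

rng = np.random.default_rng(0)
r = rejection_sample(rayleigh_pdf, b=6.0, c=np.exp(-0.5), count=100_000, rng=rng)
print(r.mean(), np.sqrt(np.pi / 2.0))   # sample mean vs. the Rayleigh mean
```

With these choices the acceptance probability is $a = 1/(bc)$, roughly 0.27, which illustrates the inefficiency discussed above when the uniform density bounds the target loosely.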

Appendix C


C.1 Limiting Forms for the Largest Order Statistic

Let $X_1 \leq X_2 \leq \ldots \leq X_n$ be the order statistics of $n$ random variables having a common distribution function $F(x)$. Assuming that the trials of drawing the random variables from the distribution function $F(x)$ are independent, the distribution function of the largest order statistic $X_n$ is given by

$$ P(X_n \leq x) = P(X_1 \leq x, X_2 \leq x, \ldots, X_n \leq x) = F^n(x). \tag{C.1} $$
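
Equation (C.1) is easy to check by simulation. The sketch below assumes a unit exponential parent distribution, chosen only because its distribution function $F(x) = 1 - e^{-x}$ is available in closed form; the values of `n`, `trials` and `x` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials, x = 10, 200_000, 2.0

# Largest order statistic of n independent draws, repeated over many trials
maxima = rng.exponential(size=(trials, n)).max(axis=1)

empirical = float(np.mean(maxima <= x))     # Monte Carlo estimate of P(X_n <= x)
predicted = (1.0 - np.exp(-x)) ** n         # F^n(x) for the unit exponential
print(empirical, predicted)
```

The two printed numbers should agree to within the Monte Carlo sampling error.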

When $F$ is continuous but unknown, an asymptotic theory is developed for $F$ in the range $0^+$ to $1^-$ [68]. It is shown that positive sequences $\{a_n\}$ and $\{b_n\}$ exist such that

$$ \lim_{n\to\infty} P\!\left(\frac{X_n - b_n}{a_n} \leq x\right) = \lim_{n\to\infty} P(X_n \leq a_n x + b_n) = \Lambda(x) \tag{C.2} $$

or equivalently, by means of equation (C.1), that

$$ \lim_{n\to\infty} F^n(a_n x + b_n) = \Lambda(x). \tag{C.3} $$

Let $n = md$ in equation (C.3), where $d$ is a fixed positive constant so that as $n \to \infty$, $m \to \infty$. Using the fact that $n = md$, we can write

$$ \lim_{m\to\infty} F^{md}(a_{md}x + b_{md}) = \lim_{n\to\infty} F^n(a_n x + b_n) = \Lambda(x). \tag{C.4} $$

It is also true that

$$ \lim_{m\to\infty} \left[F^m(a_m x + b_m)\right]^d = \lim_{m\to\infty} F^{md}(a_m x + b_m) = \Lambda^d(x). \tag{C.5} $$

If equations (C.4) and (C.5) hold, then from a theorem of Khintchine [76], there exist numbers $A_d > 0$ and $B_d > 0$ such that

$$ \Lambda^d(A_d x + B_d) = \Lambda(x) \tag{C.6} $$

for all integer values of $d$.

Solution of the above functional equation yields all the possible limiting forms for the distribution function $F^n(x)$. The constant $A_d$ may or may not be unity. If it is unity, then the functional equation to be solved is given by

$$ \Lambda^d(x + B_d) = \Lambda(x). \tag{C.7} $$

On the other hand, if $A_d$ is not unity, the form of equation (C.6) stands and there exists a value $x_{0d} = B_d/(1 - A_d)$ such that

$$ \Lambda^d(x_{0d}) = \Lambda(x_{0d}). \tag{C.8} $$

Constraining the solution to the above equation to be real and nonnegative, the solution is either $\Lambda = 0$ or $1$. However, because $\Lambda(x)$ is a distribution function, the value of $\Lambda$ can be 0 only if $x_{0d}$ is the lower endpoint at which $\Lambda(x_{0d}) = 0^+$ and $\Lambda$ can be 1 only if $x_{0d}$ is the upper endpoint at which $\Lambda(x_{0d}) = 1^-$. Since $A_d$ and $B_d$ are assumed to be finite, $x_{0d}$ must also be finite. Consequently, there is no loss in generality by assuming that the endpoint of interest is located at the origin (i.e., $x_{0d} = 0$). When $A_d \neq 1$, note that $x_{0d} = 0$ implies $B_d = 0$. As a result, the solutions for equation (C.6) fall into three cases which are given below.

1) $\Lambda^d(x + B_d) = \Lambda(x)$, $\quad A_d = 1$ (C.9)

2) $\Lambda^d(A_d x) = \Lambda(x)$, $\quad A_d \neq 1$, $\quad \Lambda = 0$ when $x = 0$ (C.10)

3) $\Lambda^d(A_d x) = \Lambda(x)$, $\quad A_d \neq 1$, $\quad \Lambda = 1$ when $x = 0$ (C.11)

C.1.1  Case  1 

Case  (1)  of  equation  (C.9)  is  solved  as  follows.  Taking  the  logarithm,  we  have 

log  A(x)  =  d  log  A(x  +  Bd).  (C.12) 

Multiplying  through  by  a  minus  sign  and  taking  the  logarithm  of  both  sides,  we  obtain 


log[-log  A(x)]  =  log  d  +  log[-log  A(x  +  Bd)\. 


(C.  13) 
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For  simplicity,  kt 

g(x)  =  log[-log  A,(w)]. 

(C.14) 

Then equation (C.13) becomes

$$g(x) = \log d + g(x + B_d). \tag{C.15}$$

Equivalently,

$$g(x - B_d) = \log d + g(x) \tag{C.16}$$

or

$$g(x) = g(x - B_d) - \log d. \tag{C.17}$$

Adding equations (C.15) and (C.17), we obtain

$$g(x + B_d) + g(x - B_d) = 2g(x). \tag{C.18}$$

The above equation is valid for all $x$ if and only if $g(x)$ is linear in $x$. Specifically, let

$$g(x) = kx + j \tag{C.19}$$

where $j$ and $k$ are constants. Then

$$g(x + B_d) = k(x + B_d) + j = g(x) - \log d = kx + j - \log d. \tag{C.20}$$

It follows that

$$kB_d = -\log d \quad \text{or} \quad k = -\frac{\log d}{B_d}. \tag{C.21}$$

Substituting equation (C.21) in equation (C.19), we see that

$$g(x) + \frac{x \log d}{B_d} = j. \tag{C.22}$$

Using equation (C.14), this result becomes

$$\log[-\log \Lambda(x)] + \frac{x \log d}{B_d} = j. \tag{C.23}$$

Thus, we have

$$\log[-\log \Lambda(x)] = -\frac{x \log d}{B_d} + j. \tag{C.24}$$

Hence, for case (1) of equation (C.9) to hold, $\log[-\log \Lambda(x)]$ must be linear in $x$.

We now solve for the sequence $\{B_d\}$. For this purpose, let $d = pq$ where $p$ and $q$ are both
integers. Note that

$$\Lambda^{pq}(x + B_{pq}) = \Lambda(x). \tag{C.25}$$

From the above equation we get

$$\Lambda(x + B_{pq}) = \Lambda^{1/pq}(x) = [\Lambda^{1/p}(x)]^{1/q} = [\Lambda(x + B_p)]^{1/q}
= \Lambda^{1/q}(x + B_p) = \Lambda((x + B_p) + B_q) = \Lambda(x + B_p + B_q). \tag{C.26}$$

Equation (C.26) implies that

$$B_{pq} = B_p + B_q. \tag{C.27}$$

We now determine the functional dependence of the sequence $\{B_d\}$ on the subscript $d$. To
emphasize this functional dependence, we rewrite equation (C.27) as

$$B(pq) = B(p) + B(q). \tag{C.28}$$

From the above equation, it is clear that the functional dependence is logarithmic. Thus, the
solution for $B_d$ is given by

$$B(d) = B_d = \log d \tag{C.29}$$

where a constant multiplier, which only rescales $x$, has been set to unity. Substituting equation
(C.29) into equation (C.24) yields

$$\log[-\log \Lambda(x)] = -x + j \tag{C.30}$$

where $j$ plays the role of a location parameter. Hence, without loss of generality, $j$ is chosen to
be zero. The above equation then simplifies to

$$\log[-\log \Lambda(x)] = -x. \tag{C.31}$$

Solving for $\Lambda(x)$ results in

$$\Lambda(x) = \exp(-e^{-x}). \tag{C.32}$$

Equation (C.32) is the solution of equation (C.9) for case 1.
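
As a quick numerical sanity check, the following Python sketch substitutes the case 1 solution of equation (C.32) back into the functional equation (C.9), with B_d = log d taken from equation (C.29). The function names and the sample values of x and d are illustrative choices, not part of the report.

```python
import math

def case1_cdf(x):
    """Limiting distribution of equation (C.32): exp(-exp(-x))."""
    return math.exp(-math.exp(-x))

def case1_residual(x, d):
    """Left side minus right side of equation (C.9), using B_d = log d from (C.29)."""
    b_d = math.log(d)
    return case1_cdf(x + b_d) ** d - case1_cdf(x)

# Largest residual over a few illustrative (x, d) pairs; it should sit at round-off level.
worst = max(abs(case1_residual(x, d)) for d in (2, 5, 50) for x in (-2.0, 0.0, 1.5, 4.0))
print(f"largest |residual| = {worst:.2e}")
```

A residual at the level of floating-point round-off is consistent with (C.32) satisfying (C.9) for every integer d tried.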

C.1.2  Cases  2  and  3 

The solutions to cases (2) and (3) of equations (C.10) and (C.11) are now derived. In both
cases we have

$$\Lambda^d(A_d x) = \Lambda(x). \tag{C.33}$$

From equation (C.33) we get

$$\log \Lambda(x) = d \log \Lambda(A_d x). \tag{C.34}$$

Multiplying through by a minus sign and taking the logarithm of both sides, we obtain

$$\log[-\log \Lambda(x)] = \log d + \log[-\log \Lambda(A_d x)]. \tag{C.35}$$

As in case 1, let

$$g(x) = \log[-\log \Lambda(x)]. \tag{C.36}$$

Then equation (C.35) becomes

$$g(x) = \log d + g(A_d x). \tag{C.37}$$

Alternatively,

$$g\!\left(\frac{x}{A_d}\right) = \log d + g(x) \tag{C.38}$$

or equivalently,

$$g(x) = -\log d + g\!\left(\frac{x}{A_d}\right). \tag{C.39}$$

Adding equations (C.37) and (C.39) results in

$$g(A_d x) + g\!\left(\frac{x}{A_d}\right) = 2g(x). \tag{C.40}$$

The solution to the above equation is

$$g(x) = \pm k \log x \quad \text{for } x > 0 \tag{C.41}$$

and

$$g(x) = \pm k \log(-x) \quad \text{for } x < 0 \tag{C.42}$$

where $k$ is a positive constant. Use of equation (C.36) in equations (C.41) and (C.42) yields

$$\log[-\log \Lambda(x)] = \pm k \log x \quad \text{for } x > 0 \tag{C.43}$$

$$\log[-\log \Lambda(x)] = \pm k \log(-x) \quad \text{for } x < 0. \tag{C.44}$$

For case 2, $\Lambda = 0$ when $x = 0$. This implies that $x = 0$ is the lower endpoint of $\Lambda(x)$. Hence,
$\Lambda(x)$ is nonzero for $x > 0$. Therefore, our solution is given by equation (C.43), where we must
choose the sign in front of $k$ to be negative. Then

$$\log[-\log \Lambda(x)] = -k \log x, \quad x > 0 \tag{C.45}$$

which results in

$$\Lambda(x) = \exp(-x^{-k}), \quad x > 0. \tag{C.46}$$

For case 3, $\Lambda = 1$ when $x = 0$. This implies that $x = 0$ is the upper endpoint of $\Lambda(x)$. Hence,
$\Lambda(x)$ is nonzero for $x < 0$. Consequently, the solution is given by equation (C.44), where we
choose the sign in front of $k$ to be positive. Then

$$\log[-\log \Lambda(x)] = k \log(-x), \quad x < 0 \tag{C.47}$$

resulting in

$$\Lambda(x) = \exp(-(-x)^k), \quad x < 0. \tag{C.48}$$
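
The solutions (C.46) and (C.48) can be checked against the functional equation (C.33) in the same spirit. The Python sketch below is illustrative only. The report does not state the scale constants A_d for these cases, so the values used here, A_d = d^(1/k) for case 2 and A_d = d^(-1/k) for case 3, are assumptions obtained by substituting each solution into (C.33).

```python
import math

def case2_cdf(x, k):
    """Equation (C.46): exp(-x**(-k)) for x > 0."""
    return math.exp(-x ** (-k))

def case3_cdf(x, k):
    """Equation (C.48): exp(-(-x)**k) for x < 0."""
    return math.exp(-((-x) ** k))

def case2_residual(x, d, k):
    """Residual of (C.33) for case 2 with the assumed constant A_d = d**(1/k)."""
    a_d = d ** (1.0 / k)
    return case2_cdf(a_d * x, k) ** d - case2_cdf(x, k)

def case3_residual(x, d, k):
    """Residual of (C.33) for case 3 with the assumed constant A_d = d**(-1/k)."""
    a_d = d ** (-1.0 / k)
    return case3_cdf(a_d * x, k) ** d - case3_cdf(x, k)

k = 1.5  # illustrative positive constant
worst2 = max(abs(case2_residual(x, d, k)) for d in (2, 10) for x in (0.5, 1.0, 3.0))
worst3 = max(abs(case3_residual(x, d, k)) for d in (2, 10) for x in (-0.5, -1.0, -3.0))
print(f"case 2: {worst2:.2e}, case 3: {worst3:.2e}")
```

Both residuals should be at round-off level, which suggests that each form reproduces itself under the power-and-rescale operation of (C.33).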

Thus, the three possible limiting forms for the distribution $\Lambda(x)$ that arise as solutions to
equation (C.6) are given as follows:

1) $\Lambda(x) = \exp(-e^{-x})$ (C.49)

2) $\Lambda(x) = \exp(-x^{-k})$, $x > 0$, $k > 0$ (C.50)

3) $\Lambda(x) = \exp(-(-x)^k)$, $x < 0$, $k > 0$. (C.51)

C.2  Tails of Probability Density Functions

Equations (C.49)-(C.51) represent the three possible limiting forms of the distribution
function for almost all smooth and continuous probability density functions. By differentiating the
three functions, we obtain the three possible limiting forms for the probability density functions
themselves.

C.2.1  Case  1 

The derivative of $\Lambda(x)$ is given by

$$H(x) = \frac{d}{dx}\Lambda(x) = \exp(-e^{-x}) \cdot (-e^{-x})(-1) = e^{-x}\exp(-e^{-x}) = \exp(-x - e^{-x}). \tag{C.52}$$

In our application we are interested in the right tail of the probability density function. Since
we have to set thresholds corresponding to small false alarm probabilities, the thresholds will be
in the right tail of the probability density function. When $x$ is very large, $x \gg e^{-x}$. Therefore,
equation (C.52) can be simplified to obtain the PDF of the tail as

$$H(x) = e^{-x}, \quad x \text{ large}. \tag{C.53}$$
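
To see how quickly the tail form (C.53) takes over, the following Python sketch compares it with the exact density (C.52), and also compares the thresholds each would give for a small false alarm probability. The threshold formulas are not in the report. They are derived here as an illustration by setting the area to the right of the threshold equal to the false alarm probability, and all function names are hypothetical.

```python
import math

def case1_pdf(x):
    """Exact density of equation (C.52): exp(-x - exp(-x))."""
    return math.exp(-x - math.exp(-x))

def case1_tail(x):
    """Tail approximation of equation (C.53): exp(-x)."""
    return math.exp(-x)

def tail_ratio(x):
    """Tail approximation divided by the exact density; tends to 1 as x grows."""
    return case1_tail(x) / case1_pdf(x)

def threshold_exact(pfa):
    """Threshold eta solving 1 - exp(-exp(-eta)) = pfa (illustrative derivation)."""
    return -math.log(-math.log1p(-pfa))

def threshold_tail(pfa):
    """Threshold eta solving exp(-eta) = pfa, the area under the tail form beyond eta."""
    return -math.log(pfa)

for x in (1.0, 3.0, 6.0, 10.0):
    print(f"x = {x:5.1f}  tail/exact = {tail_ratio(x):.6f}")
print(threshold_exact(1e-4), threshold_tail(1e-4))
```

Under these assumptions the two thresholds agree closely once the false alarm probability is small, which is the regime the text is concerned with.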

C.2.2  Case 2

The derivative of $\Lambda(x)$ is given by

$$H(x) = \frac{d}{dx}\Lambda(x) = \exp(-x^{-k}) \cdot (k x^{-k-1})
= k \exp(-x^{-k})\, e^{(-k-1)\log x} = k \exp(-x^{-k} - (k+1)\log x). \tag{C.54}$$

When $x$ is very large, $\log x \gg x^{-k}$. Therefore, equation (C.54) can be simplified to obtain the
PDF of the tail as

$$H(x) = k e^{-(k+1)\log x} = k x^{-(k+1)}, \quad x > 0, \; x \text{ large}, \; k > 0. \tag{C.55}$$
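
The case 2 result can be checked in two steps: first that the density in (C.54) really is the derivative of the distribution exp(-x^(-k)), and second that the power-law tail (C.55) approaches it for large x. The Python sketch below does both with a central difference. The value k = 2 and the step size are arbitrary illustrative choices.

```python
import math

def case2_cdf(x, k):
    """Case 2 limiting distribution: exp(-x**(-k)) for x > 0."""
    return math.exp(-x ** (-k))

def case2_pdf(x, k):
    """Exact density of equation (C.54)."""
    return k * x ** (-(k + 1)) * math.exp(-x ** (-k))

def case2_tail(x, k):
    """Tail approximation of equation (C.55): k * x**(-(k + 1))."""
    return k * x ** (-(k + 1))

def numeric_derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

k = 2.0
slope = numeric_derivative(lambda t: case2_cdf(t, k), 2.0)
print(slope, case2_pdf(2.0, k))
for x in (2.0, 10.0, 100.0):
    print(f"x = {x:6.1f}  tail/exact = {case2_tail(x, k) / case2_pdf(x, k):.6f}")
```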

C.2.3  Case 3

The derivative of $\Lambda(x)$ for this case is given by

$$H(x) = \frac{d}{dx}\Lambda(x) = \exp(-(-x)^k) \cdot (k(-x)^{k-1})
= k \exp(-(-x)^k)\, e^{(k-1)\log(-x)} = k \exp(-(-x)^k + (k-1)\log(-x)). \tag{C.56}$$

When $-x$ is very large, $(-x)^k \gg \log(-x)$. Therefore, equation (C.56) can be simplified to obtain
the PDF of the tail as

$$H(x) = k e^{-(-x)^k}, \quad x < 0, \; -x \text{ large}, \; k > 0. \tag{C.57}$$
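
For case 3 the simplification acts on the exponent: the logarithmic term is dropped because it grows much more slowly than the power term. The Python sketch below, with hypothetical function names and an illustrative k = 2, measures the size of the dropped term relative to the retained one. Note that the ratio of the two densities themselves equals (-x)^(k-1), so the agreement is in the exponent rather than in the density values.

```python
import math

def case3_log_pdf(x, k):
    """Logarithm of the exact density in equation (C.56), valid for x < 0."""
    return math.log(k) - (-x) ** k + (k - 1) * math.log(-x)

def case3_log_tail(x, k):
    """Logarithm of the tail form in equation (C.57)."""
    return math.log(k) - (-x) ** k

def dropped_over_kept(x, k):
    """Magnitude of the neglected log term relative to the retained power term."""
    return abs((k - 1) * math.log(-x)) / (-x) ** k

k = 2.0
for x in (-2.0, -10.0, -100.0):
    print(f"x = {x:7.1f}  dropped/kept = {dropped_over_kept(x, k):.3e}")
```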


A basic assumption in the above development is that successive trials are independent. This led
to equation (C.1). In practice, as $n$ becomes large, it may be difficult to ensure the independence
of  successive  trials.  To  the  extent  that  the  assumption  holds,  the  results  in  equations  (C.49-C.51) 
are  valid. 

C.3  PDF  of  the  rth  Ordered  Statistic 

Suppose that the ordered samples $X_1 \le X_2 \le \ldots \le X_n$ are drawn from the distribution function
$F(x)$. Let us further assume that the trials used to draw the samples from the distribution are
independent. Consider the $r$th ordered statistic $X_r$. Recall that $P(X_r \le x)$ is the distribution
function of $X_r$. This, in turn, is the probability that at least $r$ of the $X_i$'s are less than or equal
to $x$. Treating this as a binomial problem, the distribution function is

$$F_{X_r}(x) = P(X_r \le x) = \sum_{i=r}^{n} \binom{n}{i} F^i(x)[1 - F(x)]^{n-i} \tag{C.58}$$


where the $i$th term in the summation is the binomial probability that exactly $i$ of $X_1, X_2, \ldots, X_n$
are less than or equal to $x$. Using integration by parts, it can be shown that equation (C.58)
can be represented in terms of an integral as

$$F_{X_r}(x) = \frac{n!}{(r-1)!\,(n-r)!} \int_0^{F(x)} t^{r-1}(1-t)^{n-r}\,dt. \tag{C.59}$$

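The equivalence of the binomial sum (C.58) and the integral form (C.59) can be spot-checked numerically. The Python sketch below evaluates both as functions of p = F(x), using a simple midpoint rule for the integral. The function names, the step count, and the sample values of p, r, and n are illustrative assumptions.

```python
import math

def order_stat_cdf_sum(p, r, n):
    """Equation (C.58) with p = F(x): probability that at least r of n samples are <= x."""
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(r, n + 1))

def order_stat_cdf_integral(p, r, n, steps=20000):
    """Equation (C.59) with p = F(x), integrated over (0, p) by the midpoint rule."""
    c = math.factorial(n) / (math.factorial(r - 1) * math.factorial(n - r))
    h = p / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        total += t ** (r - 1) * (1.0 - t) ** (n - r)
    return c * h * total

for p in (0.1, 0.3, 0.7):
    print(p, order_stat_cdf_sum(p, 3, 10), order_stat_cdf_integral(p, 3, 10))
```

The two columns should agree to within the accuracy of the midpoint rule.
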

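The probability density function of the $r$th ordered statistic is the derivative of $F_{X_r}(x)$ and is
given by

$$f_{X_r}(x) = \frac{d}{dx}\left[\frac{n!}{(r-1)!\,(n-r)!} \int_0^{F(x)} t^{r-1}(1-t)^{n-r}\,dt\right]
= \frac{n!}{(r-1)!\,(n-r)!}\, F^{r-1}(x)[1 - F(x)]^{n-r} f(x) \tag{C.60}$$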


where $f(x) = \frac{d}{dx}F(x)$. Equation (C.60) represents the general form of the PDF of the $r$th ordered
statistic. If $F(x)$ is known, then the mean and the variance of the $r$th ordered statistic can be
calculated. The expected value of $X_r$ is given by

$$E(X_r) = \frac{n!}{(r-1)!\,(n-r)!} \int_{-\infty}^{\infty} x\, F^{r-1}(x)[1 - F(x)]^{n-r} f(x)\,dx. \tag{C.61}$$

An alternate form for the expected value of $X_r$ can be obtained by letting $u = F(x)$. Therefore,
$x = F^{-1}(u)$. The infinite limits of the integral in the above equation then become finite after
the transformation. The transformed integral is

$$E(X_r) = \frac{n!}{(r-1)!\,(n-r)!} \int_0^1 F^{-1}(u)\, u^{r-1}(1-u)^{n-r}\,du. \tag{C.62}$$
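
The practical benefit of the substitution u = F(x) is that the integral is taken over the finite interval (0, 1). The Python sketch below illustrates this for an assumed exponential parent distribution, F(x) = 1 - exp(-x), which is not a case treated in the report. It evaluates the mean both from (C.61), truncated at a large x, and from (C.62), and compares them with the known closed form for exponential order statistics. All names and parameter values are illustrative.

```python
import math

def coeff(r, n):
    """Leading factor n! / ((r - 1)! (n - r)!) shared by (C.61) and (C.62)."""
    return math.factorial(n) / (math.factorial(r - 1) * math.factorial(n - r))

def mean_x_domain(r, n, x_max=40.0, steps=40000):
    """Equation (C.61) for F(x) = 1 - exp(-x), truncated at x_max (midpoint rule)."""
    h = x_max / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h
        big_f = 1.0 - math.exp(-x)
        total += x * big_f ** (r - 1) * (1.0 - big_f) ** (n - r) * math.exp(-x)
    return coeff(r, n) * h * total

def mean_u_domain(r, n, steps=40000):
    """Equation (C.62) with F^{-1}(u) = -log(1 - u) on the finite interval (0, 1)."""
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        u = (j + 0.5) * h
        total += -math.log(1.0 - u) * u ** (r - 1) * (1.0 - u) ** (n - r)
    return coeff(r, n) * h * total

r, n = 3, 10
# Known result for unit-rate exponential samples: sum of 1/i for i = n-r+1, ..., n.
closed_form = sum(1.0 / i for i in range(n - r + 1, n + 1))
print(mean_x_domain(r, n), mean_u_domain(r, n), closed_form)
```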

The variance of the $r$th ordered statistic is expressed as

$$\mathrm{Var}(X_r) = E[(X_r - E(X_r))^2] = E(X_r^2) - E^2(X_r). \tag{C.63}$$

Making use of equation (C.60), $E(X_r^2)$ can be written as follows.

$$E(X_r^2) = \frac{n!}{(r-1)!\,(n-r)!} \int_{-\infty}^{\infty} x^2 F^{r-1}(x)[1 - F(x)]^{n-r} f(x)\,dx. \tag{C.64}$$

An alternate form for the expected value of $X_r^2$ can be obtained by again letting $u = F(x)$. We
then get

$$E(X_r^2) = \frac{n!}{(r-1)!\,(n-r)!} \int_0^1 [F^{-1}(u)]^2 u^{r-1}(1-u)^{n-r}\,du. \tag{C.65}$$

The variance of $X_r$ can be calculated from equations (C.62) and (C.65) when $F^{-1}(u)$ is known.
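
As an illustration of this last step, the Python sketch below computes the variance from the two moment integrals (C.62) and (C.65) for an assumed uniform parent on (0, 1), where F^{-1}(u) = u. The uniform case is chosen only because its order-statistic variance has a well-known closed form to compare against. The function names and the values r = 3, n = 10 are hypothetical.

```python
import math

def order_stat_moment(inv_cdf, r, n, power, steps=20000):
    """Midpoint-rule estimate of E(X_r**power) using the u-domain integrals (C.62)/(C.65)."""
    c = math.factorial(n) / (math.factorial(r - 1) * math.factorial(n - r))
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        u = (j + 0.5) * h
        total += inv_cdf(u) ** power * u ** (r - 1) * (1.0 - u) ** (n - r)
    return c * h * total

def order_stat_variance(inv_cdf, r, n):
    """Equation (C.63): second moment minus the squared mean."""
    mean = order_stat_moment(inv_cdf, r, n, power=1)
    second = order_stat_moment(inv_cdf, r, n, power=2)
    return second - mean ** 2

def uniform_inv_cdf(u):
    """Inverse distribution function of the assumed uniform(0, 1) parent."""
    return u

r, n = 3, 10
var_numeric = order_stat_variance(uniform_inv_cdf, r, n)
# Closed form for uniform order statistics: r (n - r + 1) / ((n + 1)^2 (n + 2)).
var_closed = r * (n - r + 1) / ((n + 1) ** 2 * (n + 2))
print(var_numeric, var_closed)
```

Any other parent distribution can be tried by passing a different inverse distribution function, provided the integrand remains well behaved near u = 0 and u = 1.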